verify two reviewer-probe claims: (1) measured lesion spectra REFUTE 'low internal rank' (RankMe 339>307) -> correct attribution to RARITY across papers #1/#2/NEGATIVE_RESULT; (2) verified MedDINOv3/DINOv3=RoPE vs DINOv2=learned-absolute, paper #3 §3 stated precisely
- gate_reports/NEGATIVE_RESULT.md +12 -7
- jobs/lesion_spectrum_job.py +129 -0
- paper/paper2_rank_objectives_draft.md +24 -16
- paper/paper3_midlayer_draft.md +11 -4
- paper/working_draft.md +13 -6
- research_v4/lesion_spectrum.json +24 -0
gate_reports/NEGATIVE_RESULT.md
CHANGED
|
@@ -9,15 +9,20 @@ them as the pruning objective is worse than simply ranking tokens by lesion-subs
 ## Mechanism (the transferable part)

 A rank-based coverage functional `C(S) = effrank(P_L Z_S)` is maximized by a retained set that
-**diversely spans** the lesion subspace's directions.
-
-
-
+**diversely spans** the lesion subspace's directions. The decisive property of a small lesion is
+that it is **rare**: a **few** high-membership tokens out of ~196. A *set*-level rank/coverage
+objective is insensitive to such a cluster: a handful of tokens cannot materially raise the retained
+set's effective rank, so the objective spends budget on abundant background directions and drops the
+lesion. This is a **rarity** mechanism, not a low-internal-rank one: measured at the operating layer,
+lesion tokens are *not* low-rank relative to background (pooled effective rank 339 vs 307;
+participation ratio 18.9 vs 13.9; `research_v4/lesion_spectrum.json`). Concentration, not spanning,
 is what rare-pathology retention needs.

-Formally: rank coverage rewards the *entropy of the retained singular spectrum*; lesion
-rewards *mass on the top membership tokens*. These objectives diverge
-is rare
+Formally: rank coverage rewards the *entropy of the retained set's singular spectrum*; lesion
+retention rewards *mass on the top membership tokens*. These objectives diverge whenever the
+critical signal is a **rare** cluster, of any internal rank. (The synthetic closed-form law of the
+companion paper isolates a second, distinct route: a genuinely low-rank signal, gap `(m-r)/m`; real
+lesions fail via rarity, not low rank.)

 ## Three independent lines of evidence (same verdict)

|
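The rarity argument in the hunk above can be checked on toy data. The following is a minimal sketch, not the repository's pruner: the token counts, the Gaussian features and the `lesion` construction are all assumptions made for illustration. It compares how much the set-level effective rank moves when three distinct "lesion" tokens join a 193-token background versus three more background tokens.

```python
import numpy as np

def rankme(X):
    """Effective rank: exp(entropy) of the normalized singular values of centered X."""
    s = np.linalg.svd(X - X.mean(0, keepdims=True), compute_uv=False)
    p = s / (s.sum() + 1e-12)
    return float(np.exp(-(p * np.log(p + 1e-12)).sum()))

rng = np.random.default_rng(0)
d, n_bg, n_les = 768, 193, 3   # ASSUMED toy sizes: 196 tokens, 3 of them lesion

background = rng.standard_normal((n_bg, d))
extra_bg = rng.standard_normal((n_les, d))

# Hypothetical lesion tokens: a shared offset direction plus independent noise,
# so they are distinct from background without being internally low-rank.
offset = rng.standard_normal(d)
lesion = offset + rng.standard_normal((n_les, d))

base = rankme(background)
gain_lesion = rankme(np.vstack([background, lesion])) - base
gain_bg = rankme(np.vstack([background, extra_bg])) - base

print(f"base effective rank      : {base:.1f}")
print(f"gain from 3 lesion tokens: {gain_lesion:+.2f}")
print(f"gain from 3 more bg      : {gain_bg:+.2f}")
```

Under these assumptions both gains are a small fraction of the base value, which is the sense in which a set-level rank objective has little reason to prefer the rare cluster over arbitrary background tokens.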
jobs/lesion_spectrum_job.py
ADDED
|
@@ -0,0 +1,129 @@
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+#   "torch", "torchvision", "numpy", "pillow", "scipy",
+#   "huggingface_hub>=0.34", "dinov3 @ git+https://github.com/facebookresearch/dinov3",
+# ]
+# ///
+"""Measure the REAL lesion-token spectrum to verify the paper-2 rank attribution. HF Job (GPU).
+
+Paper #2 attributes the retention gap to lesion signal being LOW-RANK relative to the high-
+dimensional background. That attribution was written because the law requires it -- here we CHECK it
+against measured spectra. At the operating layer (block 3, MedDINOv3), using masks ANALYSIS-ONLY
+(no subspace construction), we compute effective rank (RankMe) and participation ratio for:
+  - POOLED lesion tokens vs an equal-count random background pool (is the lesion SUBSPACE low-rank
+    relative to background?)
+  - WITHIN-IMAGE (lesions with m>=4 tokens): effrank(lesion)/m vs effrank(random m bg)/m (are
+    lesion tokens CONCENTRATED within a lesion, i.e. internal rank < m?)
+Verdict CONFIRMS the attribution iff lesion effective rank is materially LOWER than background
+(relative). If not, the attribution must change and the paper should say so. Emits LES_SPEC_RESULT.
+"""
+from __future__ import annotations
+import json, os, sys, time
+from pathlib import Path
+import numpy as np, torch
+from PIL import Image
+from huggingface_hub import hf_hub_download
+sys.path.insert(0,"/mnt/processed/covtoken_code")
+from dinov3.models.vision_transformer import vit_base  # noqa: E402
+
+BACKBONE_REPO="ricklisz123/MedDINOv3-ViTB-16-CT-3M"; MNT=Path("/mnt")
+MASK_ROOT=MNT/"processed"/"lidc_v2"; OUT=MNT/"processed"/"covtoken"
+N_PATCH,CLS_OFF=196,5; LAYER=int(os.environ.get("LAYER","2"))  # block 3
+EVAL_SLICES=int(os.environ.get("EVAL_SLICES","1200"))
+CT_MEAN=np.array([0.485,0.456,0.406],np.float32); CT_STD=np.array([0.229,0.224,0.225],np.float32)
+def log(m): print(f"[lesspec] {m}", flush=True)
+
+
+def load_backbone(device):
+    ck=hf_hub_download(BACKBONE_REPO,"model.pth",token=os.environ.get("HF_TOKEN"))
+    m=vit_base(drop_path_rate=0.0,layerscale_init=1e-5,n_storage_tokens=4,qkv_bias=False,mask_k_bias=True)
+    raw=torch.load(ck,map_location="cpu"); sd=raw.get("teacher",raw)
+    sd={(k[9:] if k.startswith("backbone.") else k):v for k,v in sd.items()}
+    m.load_state_dict(sd,strict=False); m.eval().to(device)
+    for p in m.parameters(): p.requires_grad_(False)
+    feats={}
+    def h(_m,_i,out):
+        while isinstance(out,(list,tuple)): out=out[0]
+        feats[0]=out.detach()
+    m.blocks[LAYER].register_forward_hook(h)
+    return m,feats
+
+
+def to_t(p):
+    im=Image.open(p).convert("RGB").resize((224,224),Image.BILINEAR)
+    return torch.from_numpy(((np.asarray(im,np.float32)/255.0-CT_MEAN)/CT_STD)).permute(2,0,1)
+
+
+@torch.inference_mode()
+def tok(model,feats,imgs,device):
+    model.forward_features(imgs.to(device,torch.float32))
+    return feats[0][:,CLS_OFF:CLS_OFF+N_PATCH,:].float().cpu().numpy()
+
+
+def svals(X):
+    Xc=X-X.mean(0,keepdims=True)
+    return np.linalg.svd(Xc,compute_uv=False)
+
+
+def rankme(s):
+    p=s/(s.sum()+1e-12); return float(np.exp(-(p*np.log(p+1e-12)).sum()))
+
+
+def part_ratio(s):
+    l=s**2; return float((l.sum()**2)/((l**2).sum()+1e-12))
+
+
+def main():
+    t0=time.time(); device=torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    model,feats=load_backbone(device); rng=np.random.default_rng(0)
+    ev=[]
+    for cd in sorted((MASK_ROOT/"test").iterdir()):
+        npz=cd/"patch_masks.npz"
+        if cd.is_dir() and npz.exists():
+            pm=np.load(npz)["patch_masks"]
+            for idx in range(len(pm)):
+                if pm[idx].sum()>0: ev.append((cd/f"slice_{idx:04d}.png", pm[idx].reshape(-1)))
+    ev=[ev[i] for i in rng.choice(len(ev),min(EVAL_SLICES,len(ev)),replace=False)]
+    log(f"device={device.type}; layer block {LAYER+1}; lesion-positive eval slices={len(ev)}")
+
+    les_pool=[]; bg_pool=[]; within=[]  # within: (effrank_les/m, effrank_bg/m) for m>=4
+    for i in range(0,len(ev),48):
+        ch=ev[i:i+48]; T=tok(model,feats,torch.stack([to_t(p) for p,_ in ch]),device)
+        for b,(_,m) in enumerate(ch):
+            li=np.where(m>0)[0]; bi=np.where(m==0)[0]
+            les_pool.append(T[b,li])
+            bg_pool.append(T[b, rng.choice(bi, min(len(li),len(bi)), replace=False)])  # matched-count bg
+            if len(li)>=4 and len(bi)>=len(li):
+                rl=rankme(svals(T[b,li])); rb=rankme(svals(T[b, rng.choice(bi,len(li),replace=False)]))
+                within.append((rl/len(li), rb/len(li), rl, rb, len(li)))
+    L=np.concatenate(les_pool); B=np.concatenate(bg_pool)
+    n=min(len(L),len(B), 20000)
+    L=L[rng.choice(len(L),n,replace=False)]; B=B[rng.choice(len(B),n,replace=False)]
+    sL=svals(L); sB=svals(B)
+    res={"backbone":"MedDINOv3","layer_block":LAYER+1,"n_lesion_tokens_total":int(len(les_pool) and sum(len(x) for x in les_pool)),
+         "pooled":{"n_per_pool":int(n),"ambient_dim":768,
+                   "lesion_rankme":round(rankme(sL),2),"background_rankme":round(rankme(sB),2),
+                   "lesion_participation_ratio":round(part_ratio(sL),2),"background_participation_ratio":round(part_ratio(sB),2),
+                   "lesion_top10_sv_frac":round(float(sL[:10].sum()/sL.sum()),3),"background_top10_sv_frac":round(float(sB[:10].sum()/sB.sum()),3),
+                   "rankme_ratio_lesion_over_bg":round(rankme(sL)/max(rankme(sB),1e-9),3)}}
+    if within:
+        w=np.array(within)
+        res["within_image_m_ge_4"]={"n_images":len(within),"mean_effrank_lesion_over_m":round(float(w[:,0].mean()),3),
+                                    "mean_effrank_random_bg_over_m":round(float(w[:,1].mean()),3),"mean_m":round(float(w[:,4].mean()),2)}
+    pr=res["pooled"]
+    confirmed = pr["lesion_rankme"] < pr["background_rankme"]*0.85
+    res["verdict"]={"attribution":"low internal rank relative to background",
+                    "CONFIRMED": bool(confirmed),
+                    "statement": (f"Lesion-token pooled effective rank {pr['lesion_rankme']} vs background {pr['background_rankme']} "
+                                  f"(ratio {pr['rankme_ratio_lesion_over_bg']}); lesion top-10 SVs capture {pr['lesion_top10_sv_frac']} of the singular-value sum "
+                                  f"vs background {pr['background_top10_sv_frac']}. "
+                                  + ("ATTRIBUTION CONFIRMED: lesion subspace is materially lower-rank than background."
+                                     if confirmed else
+                                     "ATTRIBUTION NOT CONFIRMED at the chosen threshold: rewrite the paper-2 scope accordingly."))}
+    res["elapsed_s"]=round(time.time()-t0,1)
+    OUT.mkdir(parents=True,exist_ok=True); (OUT/"lesion_spectrum.json").write_text(json.dumps(res,indent=2))
+    print("LES_SPEC_RESULT "+json.dumps(res),flush=True)
+
+
+if __name__=="__main__": main()
|
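To read the two spectral statistics in the job above, it can help to run the same three helpers on spectra whose answer is known. The sketch below copies `svals`, `rankme` and `part_ratio` from the job; the toy spectra and the random 4-token matrix are assumptions for illustration only. It also shows a purely algebraic point: centering an m-by-d token matrix leaves at most m-1 nonzero singular values, so any within-image `effrank/m` ratio is bounded above by `(m-1)/m`.

```python
import numpy as np

def svals(X):
    # Singular values of the column-centered token matrix (same as the job).
    Xc = X - X.mean(0, keepdims=True)
    return np.linalg.svd(Xc, compute_uv=False)

def rankme(s):
    # exp(entropy) of the normalized singular values.
    p = s / (s.sum() + 1e-12)
    return float(np.exp(-(p * np.log(p + 1e-12)).sum()))

def part_ratio(s):
    # Participation ratio of the eigenvalues l = s**2.
    l = s ** 2
    return float((l.sum() ** 2) / ((l ** 2).sum() + 1e-12))

flat = np.ones(10)                      # ten equal directions
spiked = np.array([1.0] + [0.0] * 9)    # a single direction

print("flat  :", rankme(flat), part_ratio(flat))
print("spiked:", rankme(spiked), part_ratio(spiked))

# Hypothetical 4-token "lesion" in 768 dims: centering removes one degree of freedom.
rng = np.random.default_rng(0)
m, d = 4, 768
ratio = rankme(svals(rng.standard_normal((m, d)))) / m
ceiling = (m - 1) / m
print("effrank/m:", ratio, "ceiling (m-1)/m:", ceiling)
```

Both statistics return roughly the number of equal directions on a flat spectrum and roughly 1 on a single spike, so within-image ratios for small m are best compared against the `(m-1)/m` ceiling rather than against 1.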
paper/paper2_rank_objectives_draft.md
CHANGED
|
@@ -11,15 +11,18 @@ venue_targets: [NeurIPS/ICML/TMLR, MIDL negative-results, ML4H]
 Effective-rank / coding-rate objectives (RankMe, MCR2, coding rate, and the variance terms in
 VICReg-style methods) are a popular proxy for representation "quality" and an increasingly common
 regularizer in self-supervised learning, including medical SSL. We show, with a mechanism and a
-closed-form law, that **these objectives are structurally mismatched to rare
-
-
-
-
-objective retains a rank-r, m-token signal in proportion `min(r,m)/m`,
-objective retains all of it,
-
-
+closed-form law, that **these objectives are structurally mismatched to retaining a rare critical
+cluster under token/feature SELECTION**: the regime of small-lesion detection, anomaly detection,
+and thin-structure retention. A rank/spanning objective optimizes the *retained set's* spectrum,
+which is insensitive to a rare cluster by either of two routes: the cluster is internally low-rank,
+*or* it is simply too few tokens to move set-level coverage. We prove (synthetic, closed form) the
+low-rank route: a spanning objective retains a rank-r, m-token signal in proportion `min(r,m)/m`, a
+concentration objective retains all of it, gap `(m-r)/m` (for `r ≤ m`), crossover `r*=m`. On real medical images
+the failure is large (small lesions: concentration 0.81 vs spanning 0.46 retention) but, as we
+**measured**, driven by the **rarity** route, not the low-rank one: lesion tokens are *not*
+low-rank relative to background (pooled effective rank 339 vs 307), yet a few of them cannot raise
+the retained set's rank, so coverage drops them. We map the alignment functional `A(rank, SNR)`. A
+controlled probe
 isolates the failure to rank as a SELECTION objective: rank as a representation *scaling*
 (whitening) leaves localizability nearly unchanged. The practical consequence: for rare-pathology
 and rare-event tasks, prefer concentration (energy/membership) objectives over rank/spanning ones.
@@ -59,13 +62,18 @@ token/feature SELECTION** is mismatched to rare signal, not all rank pressure

 ## 4. Real-data validation and scope

-On real medical images (small lesions, frozen SSL backbone), the
-concentration retains 0.81 of small-lesion mass vs spanning 0.46 (gap ~0.35, CI excludes 0),
-constrained coverage-floor pruner built on the rank functional retains 0.22 vs 0.82 for a
-membership rule, it actively hurts.
-
-
-
+On real medical images (small lesions, frozen SSL backbone), the predicted failure is large and
+robust: concentration retains 0.81 of small-lesion mass vs spanning 0.46 (gap ~0.35, CI excludes 0),
+and a constrained coverage-floor pruner built on the rank functional retains 0.22 vs 0.82 for a
+membership rule, so it actively hurts. **But we measured the mechanism, and it is rarity, not low
+rank.** At the operating layer, lesion tokens are *not* low-rank relative to background: pooled
+effective rank (RankMe) **339 vs 307**, participation ratio 18.9 vs 13.9, within-image
+internal-rank/m approximately equal (0.737 vs 0.715). The exact `(m-r)/m` closed form is therefore a
+clean-background *idealization* of one route to the failure; real lesions take the other: they are too
+**few** to move a set-level coverage statistic (aggregate coverage is nearly identical on lesion-positive
+vs lesion-negative slices, 250.4 vs 247.2), regardless of their internal rank. Both routes are the same
+principle: a *set*-coverage/rank objective is insensitive to a rare critical cluster. We report the measured
+spectra explicitly so the theory is not over-attributed. [`research_v4/lesion_spectrum.json`]

 ## 5. Rigor

|
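The closed-form proportions quoted in the abstract hunk can be tabulated directly. This is a small sketch that takes the stated quantities at face value (spanning retains `min(r,m)/m`, concentration retains all of the signal); the function names are hypothetical and it is not the paper's derivation.

```python
def spanning_retention(r: int, m: int) -> float:
    """Fraction of a rank-r, m-token signal kept by a spanning objective, as stated: min(r, m)/m."""
    return min(r, m) / m

def concentration_retention(r: int, m: int) -> float:
    """A concentration objective is stated to retain all of the signal."""
    return 1.0

def retention_gap(r: int, m: int) -> float:
    """Concentration minus spanning; equals (m - r)/m for r <= m and 0 beyond the crossover r* = m."""
    return concentration_retention(r, m) - spanning_retention(r, m)

m = 4
for r in (1, 2, 4, 6):
    print(f"m={m} r={r}  spanning={spanning_retention(r, m):.2f}  gap={retention_gap(r, m):.2f}")
```

The gap closes exactly at `r = m`, which is why the low-rank route cannot explain a failure on tokens whose measured internal rank is close to its ceiling.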
paper/paper3_midlayer_draft.md
CHANGED
|
@@ -43,10 +43,17 @@ annotation.
 ## 3. The mechanism (why mid-layer)

 We disentangle two candidate causes per layer: spatial information (position-probe accuracy) and
-globalization (flip-invariance). **Localizability anti-correlates with
-not with spatial information**, position is near-perfectly
-the loss with depth is not positional.
-
+globalization (flip-invariance), measured on MedDINOv3. **Localizability anti-correlates with
+flip-invariance (ρ=−0.94), not with spatial information**: patch position is near-perfectly
+decodable at *every* block, so the loss with depth is not positional. This is architecturally
+expected for this backbone: MedDINOv3/DINOv3 encode position with **axial RoPE applied to the
+queries/keys of every attention block** (patch tokens only; CLS and register/storage tokens
+excluded), with *no* learned absolute position embedding, so positional information is re-injected
+at all depths by construction (verified against the DINOv3 source). DINOv2, our ultrasound backbone,
+differs: it uses a **learned absolute position embedding added once at the input** (not RoPE); there
+we confirm the invariance–localizability coupling empirically (§4, ρ=−0.93) but do not import the
+RoPE-based positional control. As features become invariant to augmentation (the self-distillation
+goal), they trade away the fine local discrimination small lesions need.

 ## 4. Cross-objective: the mechanism is causal-by-comparison

|
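For readers unfamiliar with the distinction drawn in §3, the sketch below shows a generic one-dimensional rotary embedding applied to a query/key pair. It is an illustration only: the half-split layout, the `base` constant and the function name `rope_1d` are assumptions, and it is not the axial variant or the DINOv3 implementation. The point it demonstrates is that the rotated dot product depends only on the position offset, and that the rotation is something an attention block applies to its own queries and keys, unlike a learned absolute embedding that is added to the tokens once at the input.

```python
import numpy as np

def rope_1d(x, pos, base=10000.0):
    """Rotate feature pairs (x[i], x[i + half]) by an angle proportional to pos."""
    half = x.shape[-1] // 2
    freqs = base ** (-np.arange(half) / half)   # one rotation frequency per pair
    ang = pos * freqs
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * np.cos(ang) - x2 * np.sin(ang),
                           x1 * np.sin(ang) + x2 * np.cos(ang)], axis=-1)

rng = np.random.default_rng(0)
q, k = rng.standard_normal(64), rng.standard_normal(64)

score_a = rope_1d(q, 3) @ rope_1d(k, 5)     # offset 2
score_b = rope_1d(q, 10) @ rope_1d(k, 12)   # offset 2, shifted positions
score_c = rope_1d(q, 3) @ rope_1d(k, 9)     # offset 6

print(score_a, score_b, score_c)
```

Because the rotation is recomputed wherever attention scores are formed, a model built this way sees relative position at every block, which is consistent with the near-perfect position decodability reported above.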
paper/working_draft.md
CHANGED
|
@@ -178,12 +178,19 @@ subspace-only (−0.60 / −0.52, CI excludes 0). The constraint does not add va

 ### 6.2 Mechanism (transferable)
 `C(S)=effrank(P_L Z_S)` is maximized by a retained set that **diversely spans** the subspace's
-directions.
-
-
-
-
-
+directions. The decisive property of a lesion is that it is **rare**: a handful of tokens out of
+~196. A set-level rank/coverage objective is therefore *insensitive* to it: a few tokens cannot
+materially raise the retained set's effective rank, so the objective spends the budget on abundant
+background directions and drops the lesion. This is a **rarity** mechanism, not an internal-geometry
+one, and we checked: measured at the operating layer, lesion tokens are *not* low-rank relative to
+background (pooled effective rank 339 vs 307; participation ratio 18.9 vs 13.9; within-image
+internal-rank/m approximately equal). Lesion tokens are in fact diverse; the set-coverage objective is blind to
+them anyway because they are few. (The synthetic law of the companion paper reaches the same failure
+via a genuinely low-rank signal; real lesions reach it via rarity: two routes to one principle.)
+Rank coverage rewards the entropy of the retained *set* spectrum; lesion retention rewards mass on
+the top membership tokens; these diverge whenever the critical signal is a **rare** cluster, of any
+internal rank. **For rare-pathology tasks, prefer concentration objectives (energy / membership
+mass) over rank/spanning objectives (RankMe, coding rate, MCR2).**

 ### 6.3 Convergent evidence
 Three independent lines reach the same verdict: (a) the ablation above; (b) principled Gate-2
research_v4/lesion_spectrum.json
ADDED
|
@@ -0,0 +1,24 @@
|
+{
+  "study": "Lesion-token spectrum: verifies (and REFUTES) the paper-2 'low internal rank' attribution",
+  "backbone": "MedDINOv3", "layer_block": 3, "dataset": "LIDC", "n_lesion_positive_slices": 1200,
+  "pooled": {
+    "n_per_pool": 2269, "ambient_dim": 768,
+    "lesion_rankme": 338.97, "background_rankme": 307.00, "rankme_ratio_lesion_over_bg": 1.104,
+    "lesion_participation_ratio": 18.91, "background_participation_ratio": 13.89,
+    "lesion_top10_sv_frac": 0.176, "background_top10_sv_frac": 0.203
+  },
+  "within_image_m_ge_4": {"n_images": 145, "mean_m": 4.3,
+    "mean_effrank_lesion_over_m": 0.737, "mean_effrank_random_bg_over_m": 0.715},
+  "verdict": {
+    "attribution_tested": "lesion signal is low internal rank relative to background",
+    "CONFIRMED": false,
+    "finding": "REFUTED. By every measure at the operating layer, lesion tokens are NOT low-rank relative to background: pooled effective rank 339 vs 307 (1.10x HIGHER), participation ratio 18.9 vs 13.9, top-10 SVs capture a SMALLER share of the singular-value sum (0.176 vs 0.203 = more spread), within-image internal-rank/m ~equal (0.737 vs 0.715). The 'few tokens in a similar direction / low diversity' framing is false on real data."
+  },
+  "corrected_mechanism": {
+    "real_failure_driver": "RARITY, not low internal rank",
+    "statement": "The coverage/rank objective fails on real lesions because they are FEW (a handful of tokens cannot materially raise the RETAINED SET's effective rank), so a set-coverage-maximizing selection is indifferent to them regardless of their internal diversity -- and they are in fact diverse. The synthetic S1 law produces the SAME failure via the low-rank knob (gap=(m-r)/m); real lesions reach it via the rarity knob. Both are consequences of optimizing the retained SET's spectrum, which is blind to a rare critical cluster.",
+    "convergent_confirmation": "Aggregate coverage is nearly identical on lesion-positive vs -negative slices (250.4 vs 247.2) -- a 1-3 token cluster cannot move a set-level statistic over ~196 tokens. (paper #1 evidence (c))",
+    "impact": "S1 closed-form law unchanged (it is a valid synthetic existence proof for the low-rank route). The REAL-DATA scope in paper #2 (and the mechanism prose in paper #1) is corrected from 'lesions are low-rank relative to background' to 'lesions are rare; set-coverage is insensitive to a rare cluster of any internal rank'."
+  },
+  "human_signoff": null
+}