OutlineFlow: outline → typed room polygons
A from-scratch rectified flow-matching set-Transformer that, given only an apartment/floor outline, generates a complete set of typed interior room polygons (vector, not pixels). Challenge: Mirror Mirror on the Wall (DAVIS × Paris 2026). Dataset: Modified Swiss Dwellings.
Final results (full plan_id model, align g16 + churn 0.3, 3-seed mean)
| Config | FID ↓ | Density ↑ | Coverage ↑ |
|---|---|---|---|
| Raw baseline | 167 | 0.097 | 0.067 |
| Final | 135.3 | 0.088 | 0.111 |
FID -19%, Coverage +66% over baseline.
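The headline percentages follow directly from the two table rows. A minimal sketch of the arithmetic (the helper name `relative_change` is illustrative and not part of the repo):

```python
def relative_change(baseline: float, final: float) -> float:
    """Signed change of `final` relative to `baseline`, in percent."""
    return 100.0 * (final - baseline) / baseline

# Values copied from the results table (raw baseline row vs. final row).
fid_change = relative_change(167.0, 135.3)        # lower FID is better
density_change = relative_change(0.097, 0.088)    # higher Density is better
coverage_change = relative_change(0.067, 0.111)   # higher Coverage is better

print(f"FID {fid_change:+.0f}%, Density {density_change:+.0f}%, Coverage {coverage_change:+.0f}%")
```

By the same arithmetic, Density moves slightly in the wrong direction (about -9%), which matches the table.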
Usage
from generate import generate # from the code repo
rooms = generate(outline) # outline: shapely (Multi)Polygon or WKT
# rooms == [(shapely Polygon, room_type_id), ...] (vector, gap-free, seed 42)
Load weights: outputs_full_plan_id/ckpt.pt (the default checkpoint generate() reads).
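Since generate() returns a flat list of (polygon, room_type_id) pairs, a common first step is to group the rooms by type. The sketch below assumes only that documented return shape. `rooms_by_type` is a hypothetical helper, and placeholder strings stand in for shapely Polygons so the snippet runs without the repo or the checkpoint.

```python
from collections import defaultdict

def rooms_by_type(rooms):
    """Group generate()-style output [(polygon, room_type_id), ...] by type id."""
    grouped = defaultdict(list)
    for polygon, room_type_id in rooms:
        grouped[room_type_id].append(polygon)
    return dict(grouped)

# Placeholder strings stand in for shapely Polygons; the type ids are made up.
example_rooms = [("poly_a", 0), ("poly_b", 2), ("poly_c", 0)]
grouped = rooms_by_type(example_rooms)
counts = {type_id: len(polys) for type_id, polys in grouped.items()}
print(counts)
```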
Method (short)
- Pure rectified flow matching; DiT-style set-Transformer (AdaLN-Zero), outline encoded by a permutation-invariant PointNet, no positional encoding (a plan is a set).
- Diagnosis: the L2 objective regresses to the conditional mean → variance collapse → weak Density/Coverage.
- Two levers that work: grid-align post-process (FID) + churn / SDE sampling (Coverage).
- Diffusion (DDPM) and EBM variants were tested and did not beat this; richer conditioning (cross-attention + scale) raised outline-reading r from 0.06 to 0.67 but did not move the metric, confirming the bottleneck is the loss, not the conditioning.
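The bullets above can be illustrated with a toy, dependency-free sketch. It is not the repo's implementation. The straight-path training pair and the L2 loss are standard rectified flow matching. The churn scheme is an assumption: before each Euler step it moves back a fraction `churn` of a step and re-injects fresh Gaussian noise. All function names are hypothetical, and a closed-form toy field stands in for the trained network.

```python
import math
import random

def flow_pair(x0, x1, t):
    """Point on the straight path x_t = (1 - t) * x0 + t * x1 and its target velocity x1 - x0."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    v_target = [b - a for a, b in zip(x0, x1)]
    return x_t, v_target

def l2_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocities."""
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(v_target)

def sample(velocity_fn, x, steps=50, churn=0.0, rng=None):
    """Euler integration from t=0 (noise) to t=1 (data), with optional churn.

    Assumed churn scheme: move back from t to t_hat = t - churn / steps by
    rescaling x and adding fresh noise, so x matches the path's noise level
    at t_hat, then take one Euler step from t_hat to the next grid time.
    """
    rng = rng or random.Random(42)
    x = list(x)
    for i in range(steps):
        t, t_next = i / steps, (i + 1) / steps
        t_hat = t
        if churn > 0.0 and i > 0:
            t_hat = t - churn / steps
            scale = t_hat / t
            # Extra variance that lifts the noise term from scale*(1-t) to (1-t_hat).
            extra_var = (1.0 - t_hat) ** 2 - (scale * (1.0 - t)) ** 2
            extra_std = math.sqrt(max(0.0, extra_var))
            x = [scale * xi + extra_std * rng.gauss(0.0, 1.0) for xi in x]
        v = velocity_fn(x, t_hat)
        x = [xi + (t_next - t_hat) * vi for xi, vi in zip(x, v)]
    return x

# Toy stand-in for the network: exact velocity field when the data is one point.
target = [0.25, -0.5]

def toy_velocity(x, t):
    return [(c - xi) / (1.0 - t) for c, xi in zip(target, x)]

noise_rng = random.Random(0)
noise = [noise_rng.gauss(0.0, 1.0) for _ in target]
ode_sample = sample(toy_velocity, noise)              # deterministic ODE sampling
sde_sample = sample(toy_velocity, noise, churn=0.3)   # churned sampling

# Why L2 favors the mean: with two equally likely targets -1 and +1,
# predicting the mean (0) scores a lower loss than predicting either mode.
modes = [-1.0, 1.0]
loss_mean = sum(l2_loss([0.0], [m]) for m in modes) / len(modes)
loss_mode = sum(l2_loss([1.0], [m]) for m in modes) / len(modes)
```

In this toy setting both samplers land on the single data point. The two-mode loss comparison shows the averaging tendency that the diagnosis bullet attributes to the L2 objective.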
Code & full write-up: see the GitHub repo (training/eval/generation + result.md).
d_model=128, L=4, 1.33M params, 15k steps, full data (5,372 plans / 203k rooms), seed 42.