businesslion/scaffdiff

3d-scene-completion

scaffold-dominant


businesslion committed on Apr 6

Commit d17c4a8 · verified · 1 parent: 96ad48f

Upload folder using huggingface_hub

Files changed (2)

README.md +7 -0
vae_v3/best_point_vae.pth +3 -0

README.md CHANGED

@@ -32,6 +32,12 @@ Checkpoints for cross-modal diffusion-based 3D scene completion on SemanticKITTI
 | Teacher v1GT | `teacher_v1gt/best_model.pth` | 0.608 +/- 0.141 | LiDAR | Same architecture, trained on v1 GT |
 | Student v1GT | `student_v1gt/best_model.pth` | 0.721 +/- 0.167 | RGB (DA2 pseudo-depth) | Task-loss-only distillation from v1 teacher |
+### VAE (reconstruction, not scene completion)
+| Model | Path | CD | Description |
+|-------|------|-----|-------------|
+| VAE v3 | `vae_v3/best_point_vae.pth` | 0.120 +/- 0.026 | VecSet-style cross-attention, 32 tokens, 7.1M params |
 ## Architecture
 - **Encoder**: Frozen Sonata/PTv3 (108M params, pretrained)
@@ -54,6 +60,7 @@ SemanticKITTI outdoor driving scenes. v2 GT uses anchor-based ICP refinement (50
 |-------|--------|-----|-----------|-----|
 | Teacher v2GT | 30 | 1e-4 | 2 | RTX 4090 24GB |
 | Student v2GT | 15 | 1e-4 | 2 | RTX 4090 24GB |
+| VAE v3 | 100 | 3e-4 | 4 | RTX 4090 24GB |
 ## Citation

vae_v3/best_point_vae.pth ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:07a078f296c8ad3f49ac1dba5ba56631a6478538fdefb904dd222ed7408a75a7
+size 85442585
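
The three added lines are a Git LFS pointer: the repository stores this small text stub, and the actual checkpoint is fetched separately. A downloaded copy can be checked against the pointer's `oid` and `size` fields. The following is a minimal sketch using only the Python standard library. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repository, and the local file path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:07a078f296c8ad3f49ac1dba5ba56631a6478538fdefb904dd222ed7408a75a7
size 85442585
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    # Each pointer line is "<key> <value>", separated by a single space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    path = Path(path)
    # Cheap check first: a size mismatch means the hash cannot match either.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Hash in chunks so an ~85 MB checkpoint is never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Usage (assumes the checkpoint was already downloaded to this hypothetical path):
#   pointer = parse_lfs_pointer(POINTER)
#   print(matches_pointer("vae_v3/best_point_vae.pth", pointer))
```

Comparing the size before hashing avoids reading the whole file when a download was truncated.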