Upload folder using huggingface_hub
- README.md +7 -0
- vae_v3/best_point_vae.pth +3 -0
README.md
CHANGED
@@ -32,6 +32,12 @@ Checkpoints for cross-modal diffusion-based 3D scene completion on SemanticKITTI
 | Teacher v1GT | `teacher_v1gt/best_model.pth` | 0.608 +/- 0.141 | LiDAR | Same architecture, trained on v1 GT |
 | Student v1GT | `student_v1gt/best_model.pth` | 0.721 +/- 0.167 | RGB (DA2 pseudo-depth) | Task-loss-only distillation from v1 teacher |
 
+### VAE (reconstruction, not scene completion)
+
+| Model | Path | CD | Description |
+|-------|------|-----|-------------|
+| VAE v3 | `vae_v3/best_point_vae.pth` | 0.120 +/- 0.026 | VecSet-style cross-attention, 32 tokens, 7.1M params |
+
 ## Architecture
 
 - **Encoder**: Frozen Sonata/PTv3 (108M params, pretrained)
@@ -54,6 +60,7 @@ SemanticKITTI outdoor driving scenes. v2 GT uses anchor-based ICP refinement (50
 |-------|--------|-----|-----------|-----|
 | Teacher v2GT | 30 | 1e-4 | 2 | RTX 4090 24GB |
 | Student v2GT | 15 | 1e-4 | 2 | RTX 4090 24GB |
+| VAE v3 | 100 | 3e-4 | 4 | RTX 4090 24GB |
 
 ## Citation
 
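The new table row reports a CD score for the VAE, which presumably stands for Chamfer distance between the input and reconstructed point clouds. The README excerpt does not say which variant is used (squared or unsquared distances, sum or mean of the two directions, any normalization), so the sketch below is only one common formulation: the mean nearest-neighbor Euclidean distance in each direction, summed. The function name `chamfer_distance` is hypothetical and is not taken from the repository.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumption: unsquared Euclidean distances, averaged per direction and
    then summed. The checkpoint's actual evaluation code may differ.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def one_way(src, dst):
        # Mean distance from each source point to its nearest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


# Tiny illustrative point clouds (made-up values, not SemanticKITTI data).
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.0)]
score = chamfer_distance(original, reconstruction)
```

This brute-force version is quadratic in the number of points, so it is only suitable for small illustrations. Real evaluations on LiDAR-scale clouds typically rely on a KD-tree or a GPU implementation.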
vae_v3/best_point_vae.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:07a078f296c8ad3f49ac1dba5ba56631a6478538fdefb904dd222ed7408a75a7
+size 85442585
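The three added lines are a Git LFS pointer: the repository stores only the SHA-256 digest and byte size, while the checkpoint itself lives in LFS storage. A minimal sketch of checking a downloaded copy against this pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:07a078f296c8ad3f49ac1dba5ba56631a6478538fdefb904dd222ed7408a75a7
size 85442585
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into a dict with 'version', 'oid' and 'size'."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in chunks so an ~85 MB checkpoint is not read into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["oid"]


pointer = parse_lfs_pointer(POINTER_TEXT)
```

Assuming the checkpoint was saved locally as `vae_v3/best_point_vae.pth`, calling `matches_pointer("vae_v3/best_point_vae.pth", pointer)` should return `True` for an intact download.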