Commit d17c4a8 by businesslion · verified · 1 parent: 96ad48f

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +7 -0
  2. vae_v3/best_point_vae.pth +3 -0
README.md CHANGED
@@ -32,6 +32,12 @@ Checkpoints for cross-modal diffusion-based 3D scene completion on SemanticKITTI
 | Teacher v1GT | `teacher_v1gt/best_model.pth` | 0.608 +/- 0.141 | LiDAR | Same architecture, trained on v1 GT |
 | Student v1GT | `student_v1gt/best_model.pth` | 0.721 +/- 0.167 | RGB (DA2 pseudo-depth) | Task-loss-only distillation from v1 teacher |
 
+### VAE (reconstruction, not scene completion)
+
+| Model | Path | CD | Description |
+|-------|------|-----|-------------|
+| VAE v3 | `vae_v3/best_point_vae.pth` | 0.120 +/- 0.026 | VecSet-style cross-attention, 32 tokens, 7.1M params |
+
 ## Architecture
 
 - **Encoder**: Frozen Sonata/PTv3 (108M params, pretrained)
@@ -54,6 +60,7 @@ SemanticKITTI outdoor driving scenes. v2 GT uses anchor-based ICP refinement (50
 |-------|--------|-----|-----------|-----|
 | Teacher v2GT | 30 | 1e-4 | 2 | RTX 4090 24GB |
 | Student v2GT | 15 | 1e-4 | 2 | RTX 4090 24GB |
+| VAE v3 | 100 | 3e-4 | 4 | RTX 4090 24GB |
 
 ## Citation
 
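The `CD` column in the README tables is presumably Chamfer distance, though the diff does not state the exact convention (squared or unsquared distances, sum or mean over points, units). As a rough illustration only, here is a minimal pure-Python sketch of one common symmetric variant using unsquared Euclidean distances averaged per point. The helper name `chamfer_distance` is hypothetical and is not taken from this repository.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    `a` and `b` are sequences of equal-dimension coordinate tuples.
    Assumed convention: mean nearest-neighbour Euclidean distance,
    summed over both directions. The checkpoint's metric may differ.
    """
    def mean_nearest(src, dst):
        # For each point in src, take the distance to its closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(a, b) + mean_nearest(b, a)


pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(pred, target))
```

This brute-force version is quadratic in the number of points, so it is only suitable for small examples. Real evaluations on LiDAR-scale point clouds would typically use a spatial index or a GPU implementation.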
vae_v3/best_point_vae.pth ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:07a078f296c8ad3f49ac1dba5ba56631a6478538fdefb904dd222ed7408a75a7
+size 85442585
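The three added lines are a Git LFS pointer: the real checkpoint lives in LFS storage, and the pointer records its SHA-256 digest and byte size. A downloaded copy can be checked against those values. The sketch below uses only the Python standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib
import os

# Pointer contents as added in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:07a078f296c8ad3f49ac1dba5ba56631a6478538fdefb904dd222ed7408a75a7
size 85442585
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so an ~85 MB checkpoint is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, after downloading the checkpoint to a local `vae_v3/best_point_vae.pth`, calling `matches_pointer("vae_v3/best_point_vae.pth", parse_lfs_pointer(POINTER))` should return `True` if the file is intact.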