ilessio-aiflowlab committed on
Commit 7a6f75c · verified · 1 Parent(s): e332826

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -37,3 +37,6 @@ onnx/ran_v1.onnx.data filter=lfs diff=lfs merge=lfs -text
  paper.pdf filter=lfs diff=lfs merge=lfs -text
  tensorrt/ran_v1_fp16.trt filter=lfs diff=lfs merge=lfs -text
  tensorrt/ran_v1_fp32.trt filter=lfs diff=lfs merge=lfs -text
+ onnx/ran_v2.onnx.data filter=lfs diff=lfs merge=lfs -text
+ tensorrt/ran_v2_fp16.trt filter=lfs diff=lfs merge=lfs -text
+ tensorrt/ran_v2_fp32.trt filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,106 +1,52 @@
  ---
  tags:
- - robotics
- - anima
- - ran
- - openurban3d
- - point-cloud
  - 3d-segmentation
+ - point-cloud
  - open-vocabulary
- - clip
- - robot-flow-labs
- library_name: pytorch
- pipeline_tag: image-segmentation
+ - zero-shot
+ - urban-scene
+ - anima
  license: apache-2.0
+ datasets:
+ - sensat-urban
+ - sum
  ---

- # RAN (OpenUrban3D) — ANIMA Module
-
- Part of the [ANIMA Perception Suite](https://github.com/RobotFlow-Labs) by Robot Flow Labs.
+ # OpenUrban3D (RAN) — Annotation-Free Open-Vocabulary 3D Segmentation

- **Open-vocabulary 3D semantic segmentation for large-scale urban point clouds** without manual annotations, aligned multi-view images, or pre-trained segmentation networks.
-
- ## Paper
-
- **OpenUrban3D: Annotation-Free Open-Vocabulary Semantic Segmentation of Large-Scale Urban Point Clouds**
- Chongyu Wang, Kunlei Jing, Jihua Zhu, Di Wang
- [arXiv:2509.10842](https://arxiv.org/abs/2509.10842) (Sep 2025)
+ **Paper**: [OpenUrban3D](https://arxiv.org/abs/2509.10842) (Wang et al., Sep 2025)

  ## Architecture
-
- RAN implements a knowledge distillation pipeline:
- 1. **Multi-view rendering** Render 3D point clouds from 8 hemispherical camera viewpoints
- 2. **SLIC mask generation** Unsupervised superpixel segmentation on rendered views
- 3. **CLIP ViT-L/14 feature extraction** — Extract 768-dim vision-language features per mask
- 4. **Sample-balanced fusion** — Aggregate mask-level features to per-point embeddings
- 5. **MinkUNet distillation** — Train a 3D backbone to predict CLIP features from raw point coordinates
- 6. **Zero-shot segmentation** — At inference, compare point features with text queries via cosine similarity
-
- ### Model Details
-
- | Parameter | Value |
- |-----------|-------|
- | 3D Backbone | MinkUNet (dense fallback) |
- | Feature dim | 768 (CLIP ViT-L/14 aligned) |
- | Parameters | 0.97M |
- | VL Teacher | CLIP ViT-L/14 (frozen) |
- | Voxel size | 0.2m |
-
- ## Training
-
- | Setting | Value |
- |---------|-------|
- | Dataset | SensatUrban (24 blocks, 29.9M points) |
- | Optimizer | Adam |
- | Learning rate | 1e-4 (cosine annealing + warmup) |
- | Batch size | 4 |
- | Epochs | 43/60 (early stopped, patience=10) |
- | Best val_loss | 13.04 |
- | Final train_loss | 8.03 |
- | Precision | bf16 mixed |
- | Hardware | NVIDIA L4 (22GB) |
- | Training time | 61 min |
-
- ## Exported Formats
-
- | Format | File | Size | Use Case |
- |--------|------|------|----------|
- | PyTorch (.pth) | `pytorch/ran_v1.pth` | 3.9 MB | Training, fine-tuning |
- | SafeTensors | `pytorch/ran_v1.safetensors` | 3.9 MB | Fast loading, safe |
- | ONNX | `onnx/ran_v1.onnx` | 3.9 MB | Cross-platform inference |
- | Checkpoint | `checkpoints/best.pth` | 11 MB | Resume training (includes optimizer) |
-
- TensorRT exports deferred to target hardware (Jetson/L4).
+ - **3D Backbone**: MinkUNet (sparse 3D convolutions)
+ - **2D Feature Extractor**: ODISE (frozen)
+ - **Text Encoder**: CLIP ViT-L/14 (frozen)
+ - **Training**: Knowledge distillation (VL features → 3D backbone)

  ## Usage
-
  ```python
- import torch
- from safetensors.torch import load_file
+ from anima_ran.inference.zero_shot import ZeroShotSegmenter

- # Load model
- weights = load_file("pytorch/ran_v1.safetensors")
- # ... build model and load weights
+ segmenter = ZeroShotSegmenter(backbone_checkpoint="pytorch/ran_v1.pth")
+ segmenter.load()

- # Zero-shot segmentation
- point_features = model(point_cloud) # (N, 768)
- text_features = clip.encode_text(["building", "tree", "road"]) # (C, 768)
- similarity = point_features @ text_features.T # (N, C)
- labels = similarity.argmax(dim=-1) # (N,)
+ result = segmenter.segment(points, ["building", "vegetation", "road"])
  ```

- ## Files
-
+ ## Training Config
+ - Optimizer: Adam, LR=1e-4
+ - Epochs: 60, Batch size: 2
+ - Voxel size: 0.2m
+ - Hardware: 2x NVIDIA A6000
+
+ ## Citation
+ ```bibtex
+ @article{wang2025openurban3d,
+ title={OpenUrban3D: Annotation-Free Open-Vocabulary Semantic Segmentation of Large-Scale Urban Point Clouds},
+ author={Wang, Chongyu and Jing, Kunlei and Zhu, Jihua and Wang, Di},
+ journal={arXiv preprint arXiv:2509.10842},
+ year={2025}
+ }
  ```
- pytorch/ran_v1.pth PyTorch weights
- pytorch/ran_v1.safetensors SafeTensors weights
- onnx/ran_v1.onnx ONNX export (opset 17)
- checkpoints/best.pth Full checkpoint (model + optimizer + scheduler)
- configs/training.yaml Training configuration
- logs/training_history.json Loss curves
- paper.pdf OpenUrban3D paper (arXiv:2509.10842)
- ```
-
- ## License

- Apache 2.0 — Robot Flow Labs / AIFLOW LABS LIMITED
+ ## ANIMA Project
+ Part of the [ANIMA Wave-6](https://github.com/RobotFlow-Labs) multi-agent robotics perception system.
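
Reviewer note: the removed v1 usage snippet spelled out the zero-shot step (compare per-point features with text features, then take the arg-max), while the new README hides it behind `ZeroShotSegmenter`. A minimal NumPy sketch of that step follows. The function name `assign_labels` and the random stand-in features are assumptions for illustration only; this is not the `ZeroShotSegmenter` implementation, whose internals the diff does not show. Unlike the removed snippet, the sketch L2-normalises both sides so that the dot product is a true cosine similarity.

```python
import numpy as np


def assign_labels(point_features, text_features):
    """Return, for every point, the index of the most similar text query."""
    # L2-normalise rows so the dot product equals cosine similarity.
    p = point_features / np.linalg.norm(point_features, axis=1, keepdims=True)
    t = text_features / np.linalg.norm(text_features, axis=1, keepdims=True)
    similarity = p @ t.T              # shape (N, C)
    return similarity.argmax(axis=1)  # shape (N,)


# Stand-in data (hypothetical): 3 "text" embeddings and 5 points that are
# slightly noisy copies of known queries, all 768-dimensional.
rng = np.random.default_rng(0)
text_features = rng.normal(size=(3, 768))
true_labels = np.array([0, 2, 1, 1, 0])
point_features = text_features[true_labels] + 0.01 * rng.normal(size=(5, 768))

labels = assign_labels(point_features, text_features)
```

In real use, `point_features` would come from the distilled 3D backbone and `text_features` from the frozen text encoder; both are replaced by random vectors here.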
checkpoints/best.pth CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:81c8fce25975c13683489b0ceafb02b10bda6b6ddaa8bae4e3d0504e0e156ca8
+ oid sha256:8d0214e837998edeb77846615c3263223e3b4cdb3f12617a6c77bcfe265123ac
  size 11645755
checkpoints/best_v2.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8d0214e837998edeb77846615c3263223e3b4cdb3f12617a6c77bcfe265123ac
+ size 11645755
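
Reviewer note: the new pointer for `checkpoints/best.pth` and the added `checkpoints/best_v2.pth` carry the same `oid` and `size`, so both paths reference the same LFS object. The sketch below shows one way to check this mechanically. `parse_lfs_pointer` is a hypothetical helper written for this note, not part of `huggingface_hub` or Git LFS tooling; it only relies on the three-line `key value` layout visible in the pointers above.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, oid and size."""
    # Each pointer line is "<key> <value>"; split once on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from this commit.
BEST = """version https://git-lfs.github.com/spec/v1
oid sha256:8d0214e837998edeb77846615c3263223e3b4cdb3f12617a6c77bcfe265123ac
size 11645755
"""
BEST_V2 = """version https://git-lfs.github.com/spec/v1
oid sha256:8d0214e837998edeb77846615c3263223e3b4cdb3f12617a6c77bcfe265123ac
size 11645755
"""

best = parse_lfs_pointer(BEST)
best_v2 = parse_lfs_pointer(BEST_V2)
same_blob = best["oid"] == best_v2["oid"] and best["size"] == best_v2["size"]
```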
configs/training_v2.yaml ADDED
@@ -0,0 +1,52 @@
+ # OpenUrban3D V2 Training Config — Optimized for L4 GPU (80% VRAM target)
+ #
+ # Changes from V1:
+ # - batch_size: 4 -> 6 (1.5x throughput)
+ # - num_points: 100000 -> 95000 (profiled with real CombinedLoss)
+ # - num_workers: 2 (reduced to avoid mmap contention)
+ # - pin_memory: false (user instruction)
+ # - Peak VRAM: ~18.6 GB / 23 GB (80.9% utilization)
+
+ dataset:
+   name: sensat-urban
+   root: /mnt/forge-data/datasets/sensat_urban/
+   train_scenes: [melbourne, sydney]
+   val_scenes: [london]
+   num_points: 95000 # V2: profiled with real loss (80.9% VRAM on L4)
+
+ model:
+   backbone: minkunet
+   feature_dim: 768
+   pretrained: null
+   projection_hidden: 256
+   vl_dim: 768
+
+ distillation:
+   kl_weight: 1.0
+   cosine_weight: 0.5
+   temperature: 0.07
+
+ training:
+   max_epochs: 60
+   warmup_fraction: 0.05
+   lr: 1.0e-4
+   optimizer: adam
+   batch_size: 6 # V2: 1.5x from V1 (80.9% VRAM on L4 with real loss)
+   num_workers: 2 # V2: reduced from 4 (mmap-safe)
+   pin_memory: false # V2: disabled per instruction
+   gradient_clip: 1.0
+   seed: 42
+   use_bf16: true
+
+ early_stop_patience: 10
+ early_stop_min_delta: 1.0e-4
+ plateau_patience: 5
+ plateau_factor: 0.5
+
+ checkpoint_dir: /mnt/artifacts-datai/checkpoints/project_ran/
+ log_dir: /mnt/artifacts-datai/logs/project_ran/
+ tensorboard_dir: /mnt/artifacts-datai/tensorboard/project_ran/
+
+ version: 2 # V2 checkpoint naming
+
+ device: cuda
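
Reviewer note: the config gives `lr`, `max_epochs` and `warmup_fraction` but not the shape of the schedule. The learning rates logged in this commit are consistent with a linear warmup from 1e-6 over the first 3 epochs followed by cosine decay back to a floor of 1e-6. The sketch below reproduces those values; the floor value, the linear warmup and the function name `lr_at_epoch` are inferences from the logged numbers, not something the config or training code states. From epoch 37 onward the logged rate is half the cosine value, which matches `plateau_factor: 0.5`.

```python
import math


def lr_at_epoch(epoch, base_lr=1.0e-4, floor_lr=1.0e-6,
                max_epochs=60, warmup_fraction=0.05):
    """Learning rate for a 0-based epoch index: linear warmup, then cosine decay.

    floor_lr and the warmup shape are assumptions inferred from the log.
    """
    warmup_epochs = round(max_epochs * warmup_fraction)  # 3 for this config
    if epoch < warmup_epochs:
        # Ramp linearly from floor_lr up to base_lr.
        return floor_lr + (base_lr - floor_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to floor_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (max_epochs - warmup_epochs)
    return floor_lr + 0.5 * (base_lr - floor_lr) * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at_epoch(e) for e in range(60)]
# Epoch 37 (index 36) in the log shows the cosine value scaled by plateau_factor.
lr_after_plateau = 0.5 * schedule[36]
```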
logs/train_v2_20260403_0121.log ADDED
@@ -0,0 +1,146 @@
+ MinkowskiEngine not installed. MinkUNet will use dense fallback. Install: pip install MinkowskiEngine (requires CUDA toolkit).
+ [2026-04-03 01:21:14] INFO ran.train: ============================================================
+ [2026-04-03 01:21:14] INFO ran.train: OpenUrban3D (RAN) — Training
+ [2026-04-03 01:21:14] INFO ran.train: Paper: arXiv:2509.10842
+ [2026-04-03 01:21:14] INFO ran.train: ============================================================
+ [2026-04-03 01:21:14] INFO ran.train: Version: v2
+ [2026-04-03 01:21:14] INFO ran.train: Backbone: minkunet
+ [2026-04-03 01:21:14] INFO ran.train: VL dim: 768
+ [2026-04-03 01:21:14] INFO ran.train: Num points: 95000
+ [2026-04-03 01:21:14] INFO ran.train: Batch size: 6 (per GPU)
+ [2026-04-03 01:21:14] INFO ran.train: LR: 0.000100
+ [2026-04-03 01:21:14] INFO ran.train: Epochs: 60
+ [2026-04-03 01:21:14] INFO ran.train: Optimizer: Adam (paper)
+ [2026-04-03 01:21:14] INFO ran.train: Device: cuda
+ [2026-04-03 01:21:14] INFO ran.train: Distributed: False (world_size=1)
+ [2026-04-03 01:21:14] INFO ran.train: GPU 0: NVIDIA L4 (22.0 GB)
+ [2026-04-03 01:21:14] INFO ran.train: ============================================================
+ [2026-04-03 01:21:14] INFO anima_ran.backbone.distilled_backbone: Built MinkUNet backbone (feature_dim=768)
+ [2026-04-03 01:21:17] INFO ran.train: Real dataset found: 24 train + 6 test blocks at /mnt/forge-data/datasets/sensat_urban/
+ [2026-04-03 01:21:17] INFO ran.train: Loading mmap cache: /mnt/forge-data/datasets/sensat_urban/.cache
+ [2026-04-03 01:21:17] INFO ran.train: Mmap loaded: 29900364 points, 768-dim features (shared across ranks)
+ [2026-04-03 01:21:17] WARNING ran.train: Using MOCK VL features. Run ODISE extraction before real training.
+ [2026-04-03 01:21:18] INFO anima_ran.training.train: Starting training: 60 epochs, lr=0.000001, device=cuda, bf16=True
+ [2026-04-03 01:23:52] INFO anima_ran.training.train: [Epoch 1/60] train_loss=8054.2809 val_loss=5526.7839 lr=0.000001
+ [2026-04-03 01:23:53] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch000_val5526.7839.pth
+ [2026-04-03 01:26:19] INFO anima_ran.training.train: [Epoch 2/60] train_loss=4276.4000 val_loss=1309.8103 lr=0.000034
+ [2026-04-03 01:26:19] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch001_val1309.8103.pth
+ [2026-04-03 01:28:43] INFO anima_ran.training.train: [Epoch 3/60] train_loss=620.8463 val_loss=211.5017 lr=0.000067
+ [2026-04-03 01:28:43] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch002_val211.5017.pth
+ [2026-04-03 01:28:43] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch000_val5526.7839.pth
+ [2026-04-03 01:31:11] INFO anima_ran.training.train: [Epoch 4/60] train_loss=106.2237 val_loss=69.6856 lr=0.000100
+ [2026-04-03 01:31:11] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch003_val69.6856.pth
+ [2026-04-03 01:31:11] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch001_val1309.8103.pth
+ [2026-04-03 01:33:38] INFO anima_ran.training.train: [Epoch 5/60] train_loss=59.6002 val_loss=51.4477 lr=0.000100
+ [2026-04-03 01:33:38] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch004_val51.4477.pth
+ [2026-04-03 01:33:38] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch002_val211.5017.pth
+ [2026-04-03 01:36:04] INFO anima_ran.training.train: [Epoch 6/60] train_loss=32.9955 val_loss=28.3968 lr=0.000100
+ [2026-04-03 01:36:04] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch005_val28.3968.pth
+ [2026-04-03 01:36:04] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch003_val69.6856.pth
+ [2026-04-03 01:38:29] INFO anima_ran.training.train: [Epoch 7/60] train_loss=24.6750 val_loss=23.6595 lr=0.000099
+ [2026-04-03 01:38:29] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch006_val23.6595.pth
+ [2026-04-03 01:38:29] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch004_val51.4477.pth
+ [2026-04-03 01:40:54] INFO anima_ran.training.train: [Epoch 8/60] train_loss=20.2998 val_loss=25.5374 lr=0.000099
+ [2026-04-03 01:40:54] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch007_val25.5374.pth
+ [2026-04-03 01:40:54] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch005_val28.3968.pth
+ [2026-04-03 01:43:25] INFO anima_ran.training.train: [Epoch 9/60] train_loss=17.7125 val_loss=24.1704 lr=0.000098
+ [2026-04-03 01:43:25] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch008_val24.1704.pth
+ [2026-04-03 01:43:25] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch007_val25.5374.pth
+ [2026-04-03 01:45:53] INFO anima_ran.training.train: [Epoch 10/60] train_loss=15.8250 val_loss=17.3653 lr=0.000097
+ [2026-04-03 01:45:53] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch009_val17.3653.pth
+ [2026-04-03 01:45:53] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch008_val24.1704.pth
+ [2026-04-03 01:48:23] INFO anima_ran.training.train: [Epoch 11/60] train_loss=14.5038 val_loss=16.3036 lr=0.000096
+ [2026-04-03 01:48:23] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch010_val16.3036.pth
+ [2026-04-03 01:48:23] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch006_val23.6595.pth
+ [2026-04-03 01:50:48] INFO anima_ran.training.train: [Epoch 12/60] train_loss=13.5077 val_loss=19.3260 lr=0.000095
+ [2026-04-03 01:50:48] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch011_val19.3260.pth
+ [2026-04-03 01:50:48] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch011_val19.3260.pth
+ [2026-04-03 01:53:14] INFO anima_ran.training.train: [Epoch 13/60] train_loss=12.7002 val_loss=19.5751 lr=0.000094
+ [2026-04-03 01:53:14] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch012_val19.5751.pth
+ [2026-04-03 01:53:14] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch012_val19.5751.pth
+ [2026-04-03 01:55:39] INFO anima_ran.training.train: [Epoch 14/60] train_loss=12.1038 val_loss=15.1362 lr=0.000093
+ [2026-04-03 01:55:39] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch013_val15.1362.pth
+ [2026-04-03 01:55:39] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch009_val17.3653.pth
+ [2026-04-03 01:58:04] INFO anima_ran.training.train: [Epoch 15/60] train_loss=12.7494 val_loss=22.0125 lr=0.000091
+ [2026-04-03 01:58:04] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch014_val22.0125.pth
+ [2026-04-03 01:58:04] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch014_val22.0125.pth
+ [2026-04-03 02:00:30] INFO anima_ran.training.train: [Epoch 16/60] train_loss=13.8269 val_loss=17.6172 lr=0.000090
+ [2026-04-03 02:00:30] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch015_val17.6172.pth
+ [2026-04-03 02:00:30] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch015_val17.6172.pth
+ [2026-04-03 02:02:56] INFO anima_ran.training.train: [Epoch 17/60] train_loss=12.5961 val_loss=15.7333 lr=0.000088
+ [2026-04-03 02:02:56] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch016_val15.7333.pth
+ [2026-04-03 02:02:56] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch010_val16.3036.pth
+ [2026-04-03 02:05:27] INFO anima_ran.training.train: [Epoch 18/60] train_loss=12.1039 val_loss=20.2886 lr=0.000086
+ [2026-04-03 02:05:27] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch017_val20.2886.pth
+ [2026-04-03 02:05:27] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch017_val20.2886.pth
+ [2026-04-03 02:07:59] INFO anima_ran.training.train: [Epoch 19/60] train_loss=11.3476 val_loss=13.3878 lr=0.000084
+ [2026-04-03 02:07:59] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch018_val13.3878.pth
+ [2026-04-03 02:07:59] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch016_val15.7333.pth
+ [2026-04-03 02:10:32] INFO anima_ran.training.train: [Epoch 20/60] train_loss=10.2384 val_loss=16.9979 lr=0.000082
+ [2026-04-03 02:10:32] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch019_val16.9979.pth
+ [2026-04-03 02:10:32] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch019_val16.9979.pth
+ [2026-04-03 02:13:04] INFO anima_ran.training.train: [Epoch 21/60] train_loss=9.9192 val_loss=16.7852 lr=0.000080
+ [2026-04-03 02:13:04] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch020_val16.7852.pth
+ [2026-04-03 02:13:04] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch020_val16.7852.pth
+ [2026-04-03 02:15:36] INFO anima_ran.training.train: [Epoch 22/60] train_loss=9.6877 val_loss=13.0286 lr=0.000078
+ [2026-04-03 02:15:36] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch021_val13.0286.pth
+ [2026-04-03 02:15:36] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch013_val15.1362.pth
+ [2026-04-03 02:18:08] INFO anima_ran.training.train: [Epoch 23/60] train_loss=9.5108 val_loss=12.8504 lr=0.000075
+ [2026-04-03 02:18:08] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch022_val12.8504.pth
+ [2026-04-03 02:18:08] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch018_val13.3878.pth
+ [2026-04-03 02:20:41] INFO anima_ran.training.train: [Epoch 24/60] train_loss=9.4225 val_loss=16.0130 lr=0.000073
+ [2026-04-03 02:20:41] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch023_val16.0130.pth
+ [2026-04-03 02:20:41] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch023_val16.0130.pth
+ [2026-04-03 02:23:21] INFO anima_ran.training.train: [Epoch 25/60] train_loss=9.2871 val_loss=15.3460 lr=0.000070
+ [2026-04-03 02:23:21] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch024_val15.3460.pth
+ [2026-04-03 02:23:21] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch024_val15.3460.pth
+ [2026-04-03 02:25:57] INFO anima_ran.training.train: [Epoch 26/60] train_loss=9.1418 val_loss=12.7876 lr=0.000068
+ [2026-04-03 02:25:58] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch025_val12.7876.pth
+ [2026-04-03 02:25:58] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch021_val13.0286.pth
+ [2026-04-03 02:28:34] INFO anima_ran.training.train: [Epoch 27/60] train_loss=8.9928 val_loss=12.7671 lr=0.000065
+ [2026-04-03 02:28:34] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch026_val12.7671.pth
+ [2026-04-03 02:28:34] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch022_val12.8504.pth
+ [2026-04-03 02:31:05] INFO anima_ran.training.train: [Epoch 28/60] train_loss=8.8741 val_loss=14.4635 lr=0.000063
+ [2026-04-03 02:31:05] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch027_val14.4635.pth
+ [2026-04-03 02:31:05] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch027_val14.4635.pth
+ [2026-04-03 02:33:35] INFO anima_ran.training.train: [Epoch 29/60] train_loss=8.7588 val_loss=15.1041 lr=0.000060
+ [2026-04-03 02:33:35] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch028_val15.1041.pth
+ [2026-04-03 02:33:35] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch028_val15.1041.pth
+ [2026-04-03 02:36:05] INFO anima_ran.training.train: [Epoch 30/60] train_loss=8.6943 val_loss=12.5515 lr=0.000057
+ [2026-04-03 02:36:05] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch029_val12.5515.pth
+ [2026-04-03 02:36:05] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch025_val12.7876.pth
+ [2026-04-03 02:38:38] INFO anima_ran.training.train: [Epoch 31/60] train_loss=8.6135 val_loss=13.5761 lr=0.000055
+ [2026-04-03 02:38:38] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch030_val13.5761.pth
+ [2026-04-03 02:38:38] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch030_val13.5761.pth
+ [2026-04-03 02:41:10] INFO anima_ran.training.train: [Epoch 32/60] train_loss=8.4979 val_loss=13.6059 lr=0.000052
+ [2026-04-03 02:41:10] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch031_val13.6059.pth
+ [2026-04-03 02:41:10] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch031_val13.6059.pth
+ [2026-04-03 02:43:42] INFO anima_ran.training.train: [Epoch 33/60] train_loss=9.5209 val_loss=15.6166 lr=0.000049
+ [2026-04-03 02:43:42] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch032_val15.6166.pth
+ [2026-04-03 02:43:42] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch032_val15.6166.pth
+ [2026-04-03 02:46:13] INFO anima_ran.training.train: [Epoch 34/60] train_loss=9.0347 val_loss=15.3416 lr=0.000046
+ [2026-04-03 02:46:13] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch033_val15.3416.pth
+ [2026-04-03 02:46:13] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch033_val15.3416.pth
+ [2026-04-03 02:48:37] INFO anima_ran.training.train: [Epoch 35/60] train_loss=8.8009 val_loss=14.2802 lr=0.000044
+ [2026-04-03 02:48:37] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch034_val14.2802.pth
+ [2026-04-03 02:48:37] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch034_val14.2802.pth
+ [2026-04-03 02:51:04] INFO anima_ran.training.train: [Epoch 36/60] train_loss=8.4578 val_loss=13.6164 lr=0.000041
+ [2026-04-03 02:51:04] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch035_val13.6164.pth
+ [2026-04-03 02:51:04] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch035_val13.6164.pth
+ [2026-04-03 02:53:32] INFO anima_ran.training.train: [Epoch 37/60] train_loss=8.2129 val_loss=13.2788 lr=0.000019
+ [2026-04-03 02:53:32] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch036_val13.2788.pth
+ [2026-04-03 02:53:32] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch036_val13.2788.pth
+ [2026-04-03 02:55:59] INFO anima_ran.training.train: [Epoch 38/60] train_loss=8.0754 val_loss=13.4448 lr=0.000018
+ [2026-04-03 02:55:59] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch037_val13.4448.pth
+ [2026-04-03 02:55:59] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch037_val13.4448.pth
+ [2026-04-03 02:58:22] INFO anima_ran.training.train: [Epoch 39/60] train_loss=8.0478 val_loss=13.0786 lr=0.000017
+ [2026-04-03 02:58:22] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch038_val13.0786.pth
+ [2026-04-03 02:58:22] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch038_val13.0786.pth
+ [2026-04-03 03:00:49] INFO anima_ran.training.train: [Epoch 40/60] train_loss=8.0181 val_loss=12.9826 lr=0.000015
+ [2026-04-03 03:00:49] INFO anima_ran.training.train: Saved checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch039_val12.9826.pth
+ [2026-04-03 03:00:49] INFO anima_ran.training.train: Deleted old checkpoint: /mnt/artifacts-datai/checkpoints/project_ran/project_ran_cuda_v2_epoch039_val12.9826.pth
+ [2026-04-03 03:00:49] INFO anima_ran.training.train: [EARLY STOP] epoch 40, best val_loss=12.5515 at patience=10
+ [2026-04-03 03:00:49] INFO anima_ran.training.train: Saved training history: /mnt/artifacts-datai/logs/project_ran/training_history.json
+ [2026-04-03 03:00:49] INFO ran.train: Training complete in 99.5 minutes
+ [2026-04-03 03:00:49] INFO ran.train: Final train_loss: 8.0181
+ [2026-04-03 03:00:49] INFO ran.train: Final val_loss: 12.9826
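
Reviewer note: the `[EARLY STOP]` line can be cross-checked by replaying the logged validation losses through a patience counter. The sketch below assumes that an epoch counts as an improvement only when it beats the best loss by more than `early_stop_min_delta`, and that training stops once `early_stop_patience` consecutive epochs fail to improve. That rule and the function name `early_stop_epoch` are assumptions; the trainer's actual logic is not part of this diff. The loss values are copied from the log above, rounded to four decimals as logged.

```python
def early_stop_epoch(val_losses, patience=10, min_delta=1.0e-4):
    """Replay validation losses; return (stop_epoch, best_loss, best_epoch), 1-based.

    stop_epoch is None if the patience limit is never reached.
    """
    best_loss, best_epoch, bad_epochs = float("inf"), None, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # Improvement: record it and reset the patience counter.
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_loss, best_epoch
    return None, best_loss, best_epoch


# val_loss per epoch, as printed in logs/train_v2_20260403_0121.log
VAL_LOSSES = [
    5526.7839, 1309.8103, 211.5017, 69.6856, 51.4477, 28.3968, 23.6595, 25.5374,
    24.1704, 17.3653, 16.3036, 19.3260, 19.5751, 15.1362, 22.0125, 17.6172,
    15.7333, 20.2886, 13.3878, 16.9979, 16.7852, 13.0286, 12.8504, 16.0130,
    15.3460, 12.7876, 12.7671, 14.4635, 15.1041, 12.5515, 13.5761, 13.6059,
    15.6166, 15.3416, 14.2802, 13.6164, 13.2788, 13.4448, 13.0786, 12.9826,
]

stop_epoch, best_loss, best_epoch = early_stop_epoch(VAL_LOSSES)
```

Under these assumptions the replay stops at epoch 40 with the best loss recorded at epoch 30, which agrees with the log line.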
logs/training_history.json CHANGED
@@ -1,93 +1,87 @@
  {
  "train_loss": [
- 6759.289434176772,
- 2297.732088914558,
- 203.5367457831084,
- 58.58010551111022,
- 31.958893590898658,
- 23.83927464841017,
- 18.78112209377004,
- 16.055160024272862,
- 14.373852729797363,
- 13.272072649713772,
- 12.286517015144007,
- 11.6127598036581,
- 14.472964044827133,
- 11.232773851992478,
- 10.735281374917102,
- 10.398801376570516,
- 11.63162669850819,
- 11.858689521675679,
- 9.884707023848348,
- 10.34442413387014,
- 9.822304654477248,
- 9.251953125,
- 9.074569132790637,
- 9.016819014478086,
- 8.908954492255823,
- 8.774045004773496,
- 9.959359083602678,
- 8.718674019201478,
- 8.617429391661688,
- 8.944636501483064,
- 8.801453462287562,
- 8.753478875800745,
- 8.771042994598844,
- 8.832364993308907,
- 8.521471308238471,
- 8.312144251012091,
- 8.966958615317273,
- 8.718514058127331,
- 8.46768801959593,
- 8.124264048106635,
- 8.073782081034645,
- 8.04591079968125,
- 8.032603505832046
+ 8054.280927942154,
+ 4276.39997766373,
+ 620.8462735439869,
+ 106.22373913704081,
+ 59.60022532686274,
+ 32.99547848802932,
+ 24.675044526445106,
+ 20.299824085641415,
+ 17.712546612354036,
+ 15.825045687087039,
+ 14.50377963451629,
+ 13.507708427753855,
+ 12.700157977165059,
+ 12.10383729731783,
+ 12.749443865836934,
+ 13.826934652125582,
+ 12.59610947142256,
+ 12.10389169733575,
+ 11.347582918532352,
+ 10.23838562661029,
+ 9.919178211942633,
+ 9.687722490188923,
+ 9.510761646514243,
+ 9.422516193795712,
+ 9.287059438989518,
+ 9.141846352435174,
+ 8.99275037075611,
+ 8.874052149184207,
+ 8.758765200351148,
+ 8.694337601357319,
+ 8.613513378386802,
+ 8.497914131651534,
+ 9.520941612568308,
+ 9.034681218735715,
+ 8.800915291968812,
+ 8.457775846440741,
+ 8.21288691175745,
+ 8.075425716156655,
+ 8.04781052406798,
+ 8.018138570988432
  ],
  "val_loss": [
- 5596.94873046875,
- 534.4771347045898,
- 94.13680362701416,
- 47.7150354385376,
- 31.689432382583618,
- 24.11519694328308,
- 24.731905698776245,
- 22.63585090637207,
- 17.82232165336609,
- 18.18808937072754,
- 15.790228843688965,
- 18.132004499435425,
- 17.121837615966797,
- 14.551922082901001,
- 14.441855549812317,
- 19.299299716949463,
- 15.931878924369812,
- 13.824809551239014,
- 14.941338181495667,
- 16.136274099349976,
- 16.182728052139282,
- 13.80290949344635,
- 13.546823859214783,
- 14.907316327095032,
- 14.543841481208801,
- 14.061812162399292,
- 13.720631957054138,
- 13.278507113456726,
- 13.976499438285828,
- 15.821560502052307,
- 14.872854590415955,
- 13.361186623573303,
- 13.039013624191284,
- 13.552350878715515,
- 14.962332725524902,
- 13.270270943641663,
- 13.803866624832153,
- 13.932194352149963,
- 14.697306752204895,
- 13.543349146842957,
- 13.248373866081238,
- 13.47856891155243,
- 13.577248096466064
+ 5526.783854166667,
+ 1309.810323079427,
+ 211.5017293294271,
+ 69.68559646606445,
+ 51.447741190592446,
+ 28.39682896931966,
+ 23.65952777862549,
+ 25.53741518656413,
+ 24.170440673828125,
+ 17.36533260345459,
+ 16.303635279337566,
+ 19.32600466410319,
+ 19.575107256571453,
+ 15.136160055796305,
+ 22.01252810160319,
+ 17.61719799041748,
+ 15.733330567677816,
+ 20.288647333780926,
+ 13.387786865234375,
+ 16.997878710428875,
+ 16.785153071085613,
+ 13.028555075327555,
+ 12.850395520528158,
+ 16.01299540201823,
+ 15.346006552378336,
+ 12.787633577982584,
+ 12.76709270477295,
+ 14.463542620340982,
+ 15.104134241739908,
+ 12.55145819981893,
+ 13.576054255167643,
+ 13.605932076772055,
+ 15.616601149241129,
+ 15.34158992767334,
+ 14.280176798502604,
+ 13.616401354471842,
+ 13.278789520263672,
+ 13.444838682810465,
+ 13.078571478525797,
+ 12.982638994852701
  ],
  "lr": [
  1.0000000000000002e-06,
@@ -126,12 +120,9 @@
  4.641232239911956e-05,
  4.3700998887988794e-05,
  4.101032371977495e-05,
- 3.834846838653045e-05,
- 3.572351685541612e-05,
- 3.3143441017958675e-05,
- 1.530803823983901e-05,
- 1.4116198681937466e-05,
- 1.295715261874775e-05,
- 1.1834420035069828e-05
+ 1.9174234193265224e-05,
+ 1.789689978215328e-05,
+ 1.6641402447669613e-05,
+ 1.5411555093955053e-05
  ]
  }
onnx/ran_v2.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3484850f9d9402bae948abd1dd59d1108cedfd215b98dd53981287d04dad34fb
+ size 6527
onnx/ran_v2.onnx.data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5a1bcf2a538b14d382e3df67d0b443b6e2c872900b72374433a41125b8d24b36
+ size 3866624
pytorch/ran_v2.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:87b1e13b2a8461961a94030b9cb7961300c4aeb747806bcc86e262e7c646de2c
+ size 3886501
pytorch/ran_v2.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1e567c7abefcfe755e9334a2b4647367b6bdeaece46271ebc42d66270af32e5b
+ size 3877880
tensorrt/ran_v2_fp16.trt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a0eedead665b525c24595dcb3506dce7bad68330095f94fd14236602231a1032
+ size 2054964
tensorrt/ran_v2_fp32.trt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:530cd9efaf8fa7fd723b87b4f30ad57331745d71c9f524f8b717c95f33d719cc
+ size 3959340