Avakn committed on
Commit 8612e2c · verified · 1 Parent(s): 2dd2413

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +101 -0
README.md ADDED
# RoboTwin2 Checkpoints

Trained on [RoboTwin2.0](https://github.com/TianxingChen/RoboTwin) using the [TianxingChen/RoboTwin2.0](https://huggingface.co/datasets/TianxingChen/RoboTwin2.0) dataset.

## Tasks
- `place_phone_stand`
- `place_a2b_left`
- `move_can_pot`
- `handover_block`
- `put_bottles_dustbin`

## Data
- **Demonstrations:** 50 `demo_clean` episodes per task
- **Embodiment:** aloha-agilex (dual-arm)
- **Action dim:** 14 (6 DOF × 2 arms + 2 grippers)
- **Cameras:** `cam_high`, `cam_right_wrist`, `cam_left_wrist`

---

## ACT (Action Chunking with Transformers)

### Architecture
| Param | Value |
|---|---|
| Backbone | ResNet-18 |
| Hidden dim | 512 |
| Feedforward dim | 3200 |
| Attention heads | 8 |
| Encoder layers | 4 |
| Decoder layers | 7 |
| Chunk size | 50 |
| KL weight | 10 |
| Action dim | 14 |
| Dropout | 0.1 |
| Parameters | ~83.9M |
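With a chunk size of 50 and an action dim of 14, each forward pass of the policy predicts a whole sequence of future actions rather than a single step. A minimal shape sketch in plain NumPy (`fake_policy` is an illustrative stand-in, not the actual ACT decoder):

```python
import numpy as np

CHUNK_SIZE = 50   # actions predicted per forward pass
ACTION_DIM = 14   # 6 DOF x 2 arms + 2 grippers

def fake_policy(obs: np.ndarray) -> np.ndarray:
    """Stand-in for the ACT decoder: one observation in, a chunk of actions out."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((CHUNK_SIZE, ACTION_DIM))

obs = np.zeros((3, 480, 640, 3))  # three camera views, as in the Data section
chunk = fake_policy(obs)
print(chunk.shape)  # (50, 14)
```

At rollout time such a chunk is either executed open-loop or blended across overlapping predictions (temporal ensembling), which is what makes the chunk size a key hyperparameter.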

### Training
| Param | Value |
|---|---|
| Batch size | 8 |
| Epochs | 6000 |
| Learning rate | 1e-5 |
| Backbone LR | 1e-5 |
| Weight decay | 1e-4 |
| Optimizer | AdamW |
| Save freq | every 2000 epochs |

### Checkpoints
| Path | Seed | Val Loss |
|---|---|---|
| `ACT/act-place_phone_stand/demo_clean-50/` | 0 | — |
| `ACT/act-place_phone_stand-run2/demo_clean-50/` | 1 | 0.038 |
| `ACT/act-place_a2b_left/demo_clean-50/` | 0 | — |
| `ACT/act-place_a2b_left-run2/demo_clean-50/` | 1 | 0.059 |
| `ACT/act-move_can_pot/demo_clean-50/` | 0 | — |
| `ACT/act-move_can_pot-run2/demo_clean-50/` | 1 | 0.036 |
| `ACT/act-handover_block-run2/demo_clean-50/` | 1 | 0.030 |
| `ACT/act-put_bottles_dustbin-run2/demo_clean-50/` | 1 | 0.032 |

Each checkpoint directory contains:
- `policy_best.ckpt` — best validation loss checkpoint
- `policy_last.ckpt` — final epoch checkpoint
- `policy_epoch_{2000,4000,5000,6000}_seed_{0,1}.ckpt` — intermediate checkpoints
- `dataset_stats.pkl` — normalization statistics
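The `dataset_stats.pkl` file holds the statistics used to normalize actions during training, so policy outputs must be unnormalized before execution. A self-contained sketch of that round trip (the key names `action_mean`/`action_std` are assumptions for illustration, not the training script's actual schema):

```python
import pickle
import numpy as np

# Hypothetical stats in the spirit of dataset_stats.pkl (key names are assumed).
stats = {"action_mean": np.zeros(14), "action_std": np.full(14, 0.5)}
blob = pickle.dumps(stats)   # stand-in for reading the real .pkl from disk
stats = pickle.loads(blob)

def unnormalize(a: np.ndarray, s: dict) -> np.ndarray:
    """Map normalized policy output back to raw joint commands."""
    return a * s["action_std"] + s["action_mean"]

pred = np.ones((50, 14))     # one chunk of normalized actions
raw = unnormalize(pred, stats)
print(raw[0, 0])  # 0.5
```

Keeping the stats file alongside each checkpoint matters because the normalization is per-dataset; mixing a checkpoint with another task's stats silently produces wrong joint commands.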

---

## Pi0.5 LoRA (`place_phone_stand` only)

Fine-tuned from `gs://openpi-assets/checkpoints/pi05_base/params` using the [openpi](https://github.com/Physical-Intelligence/openpi) framework.

### Architecture
| Param | Value |
|---|---|
| Base model | Pi0.5 (3B params) |
| PaliGemma variant | `gemma_2b_lora` |
| Action expert variant | `gemma_300m_lora` |
| Fine-tuning method | LoRA |

### Training
| Param | Value |
|---|---|
| Batch size | 32 |
| Total steps | 20,000 (trained to 9,000) |
| Save interval | 200 steps |
| XLA memory fraction | 0.45 (64 GB pool on H200) |
| GPU | NVIDIA H200 (143 GB VRAM) |
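The XLA memory fraction is typically set through JAX's `XLA_PYTHON_CLIENT_MEM_FRACTION` environment variable before JAX initializes its allocator; 0.45 of the H200's 143 GB works out to the roughly 64 GB pool listed above. A small sketch of that arithmetic (assuming the env-var mechanism is what produced the table entry):

```python
import os

VRAM_GB = 143      # H200 SXM
FRACTION = 0.45    # value used for these runs

# Must be set before the first JAX import, or the default allocator wins.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = str(FRACTION)

pool_gb = VRAM_GB * FRACTION
print(f"preallocated pool: about {pool_gb:.0f} GB")  # a bit over 64 GB
```

Capping the pool like this leaves headroom for the simulator's Vulkan renderer and other processes sharing the GPU.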

### Checkpoints
| Path | Step |
|---|---|
| `pi05_lora/place_phone_stand/step_5000/` | 5,000 |
| `pi05_lora/place_phone_stand/step_9000/` | 9,000 |

---

## Environment
- **Framework:** [RoboTwin2.0](https://github.com/TianxingChen/RoboTwin)
- **Simulator:** SAPIEN with Vulkan rendering
- **GPU:** NVIDIA H200 SXM (143 GB VRAM)
- **CUDA:** 12.8