dgrachev committed on
Commit ed80304 · verified · 1 Parent(s): 801483b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +62 -15
README.md CHANGED
@@ -1,30 +1,77 @@
- # A2 Pretrained Model

- Pretrained ViLGP3D model for 6-DOF grasp pose selection in tabletop manipulation.

- ## Model Architecture

- - **Network**: CLIPAction (CLIP-based action selection with cross-attention)
- - **Width**: 768
- - **Layers**: 1
- - **Heads**: 8
- - **Action Dim**: 7 (xyz + quaternion)
- - **Features**: RoPE (Rotary Position Encoding)

  ## Usage

  ```python
- from lerobot_policy_a2 import A2Policy, A2Config

  # Load pretrained model
  policy = A2Policy.from_pretrained("dgrachev/a2_pretrained")
  ```

- ## Training Data

- Trained on simulated tabletop grasping with UR5e robot and Robotiq gripper.

- ## Related

- - Environment: Install with `pip install lerobot[a2]`
- - Assets: [dgrachev/a2_assets](https://huggingface.co/datasets/dgrachev/a2_assets)
 
+ ---
+ license: apache-2.0
+ tags:
+ - robotics
+ - manipulation
+ - grasp
+ - lerobot
+ - clip
+ ---

+ # A2 Pretrained Policy

+ Pretrained ViLGP3D policy for 6-DOF grasp and place tasks in tabletop manipulation.

+ ## Model Description
+
+ This model uses CLIP-based cross-attention for selecting grasp and place poses from candidates generated by GraspNet/PlaceNet.
+
+ ## Files
+
+ - `sl_checkpoint_199.pth`: Trained policy weights (ViLGP3D fusion network)
+ - `checkpoint-rs.tar`: GraspNet checkpoint for grasp candidate generation
 
  ## Usage

+ ### With lerobot_policy_a2
+
  ```python
+ from lerobot_policy_a2 import A2Policy

  # Load pretrained model
  policy = A2Policy.from_pretrained("dgrachev/a2_pretrained")
+
+ # Use for grasp prediction
+ action, info = policy.predict_grasp(
+     color_images={"front": rgb_image},
+     depth_images={"front": depth_image},
+     point_cloud=point_cloud,
+     lang_goal="grasp a round object"
+ )
+ ```
+
+ ### With LeRobot A2 Environment
+
+ ```bash
+ # Data collection
+ A2_DISABLE_EGL=true uv run python -m lerobot.envs.a2_collect --policy a2 --hf_repo dgrachev/a2_pretrained --task grasp --num_episodes 100
+
+ # Benchmark evaluation
+ A2_DISABLE_EGL=true uv run python -m lerobot.envs.a2_benchmark --task grasp --policy a2 --hf_repo dgrachev/a2_pretrained
  ```
 
+ ## Training Details

+ - **Architecture**: ViLGP3D with CLIP ViT-B/32 backbone
+ - **Hidden dim**: 768
+ - **Attention heads**: 8
+ - **Position encoding**: Rotary Position Encoding (RoPE)
+ - **Training data**: Tabletop manipulation demonstrations

+ ## Related Resources

+ - [lerobot_policy_a2](https://github.com/dgrachev/lerobot_policy_a2) - Policy package
+ - [lerobot_grach0v](https://github.com/grach0v/lerobot) - LeRobot fork with A2 environment
+ - [a2_assets](https://huggingface.co/datasets/dgrachev/a2_assets) - Environment assets
+
+ ## Citation
+
+ ```bibtex
+ @misc{a2_policy,
+   author = {Denis Grachev},
+   title = {A2 Policy: CLIP-based 6-DOF Grasp and Place Policy},
+   year = {2025},
+   publisher = {HuggingFace},
+   url = {https://huggingface.co/dgrachev/a2_pretrained}
+ }
+ ```
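
The Model Description added in this commit says the policy selects a grasp or place pose from a set of candidates using CLIP-based cross-attention. The idea of attention-based candidate selection can be sketched in a few lines of plain Python. This is an illustration only, not the ViLGP3D implementation: `softmax`, `select_candidate`, the 4-dimensional toy vectors, and the single-query scaled dot-product scoring are all assumptions made for the example, and the real model operates on learned CLIP and point-cloud features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_candidate(query, candidate_features):
    """Return (best_index, attention_weights) for one query vector.

    Hypothetical helper: scores each candidate by its scaled dot
    product with the query, then normalizes the scores with softmax.
    """
    if not candidate_features:
        raise ValueError("need at least one candidate")
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, feat)) / scale
        for feat in candidate_features
    ]
    weights = softmax(scores)
    best = max(range(len(weights)), key=weights.__getitem__)
    return best, weights


# Toy data (made up): one "language" query and three candidate pose features.
query = [1.0, 0.0, 1.0, 0.0]
candidates = [
    [0.1, 0.9, 0.0, 0.2],
    [0.8, 0.1, 0.9, 0.0],
    [0.3, 0.3, 0.3, 0.3],
]
best_index, weights = select_candidate(query, candidates)
print(best_index, [round(w, 3) for w in weights])
```

The candidate whose features align best with the query receives the largest attention weight and is selected. In the README's usage snippet, the corresponding step happens inside `policy.predict_grasp`.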