---
license: apache-2.0
tags:
- robotics
- manipulation
- grasp
- lerobot
- clip
---
# A2 Pretrained Policy
Pretrained ViLGP3D policy for 6-DOF grasp and place tasks in tabletop manipulation.
## Model Description
The policy uses CLIP-based cross-attention to select grasp and place poses from candidates generated by GraspNet and PlaceNet.
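To make the selection step concrete, here is a minimal sketch of scoring candidate poses against a language feature with single-head scaled dot-product attention. It is an illustration only, not the ViLGP3D implementation: `select_candidate`, `lang_feat`, and `cand_feats` are hypothetical names, the features are random stand-ins, and the learned projections of a real cross-attention layer are omitted.

```python
import numpy as np


def select_candidate(lang_feat, cand_feats):
    """Score candidates against one language feature and return the best index.

    Hypothetical helper: single-head attention with no learned projections.
    """
    d = lang_feat.shape[-1]
    # One query (the instruction) against N keys (the candidate poses).
    logits = cand_feats @ lang_feat / np.sqrt(d)
    # Numerically stable softmax over the candidates.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return int(np.argmax(weights)), weights


rng = np.random.default_rng(0)
lang_feat = rng.normal(size=768)        # stand-in for a language embedding
cand_feats = rng.normal(size=(5, 768))  # stand-in for 5 candidate pose features
cand_feats[3] = lang_feat               # make candidate 3 the best match on purpose
best, weights = select_candidate(lang_feat, cand_feats)
```

The softmax weights form a distribution over candidates, so the chosen pose is simply the one with the highest attention weight.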
## Files
- `sl_checkpoint_199.pth`: Trained policy weights (ViLGP3D fusion network)
- `checkpoint-rs.tar`: GraspNet checkpoint for grasp candidate generation
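If only the raw checkpoint files are needed, they can be fetched with `huggingface_hub`. The sketch below assumes both files sit at the root of the `dgrachev/a2_pretrained` repository; `download_checkpoints` is a hypothetical helper name, not part of any package.

```python
REPO_ID = "dgrachev/a2_pretrained"
CHECKPOINT_FILES = ("sl_checkpoint_199.pth", "checkpoint-rs.tar")


def download_checkpoints(repo_id=REPO_ID, filenames=CHECKPOINT_FILES):
    """Download each file into the local Hugging Face cache and return its path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    # Assumption: the files are stored at the repository root.
    return {name: hf_hub_download(repo_id=repo_id, filename=name) for name in filenames}
```

Calling `download_checkpoints()` returns a dictionary mapping each filename to its cached local path.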
## Usage
### With lerobot_policy_a2
```python
from lerobot_policy_a2 import A2Policy

# Load pretrained model
policy = A2Policy.from_pretrained("dgrachev/a2_pretrained")

# rgb_image, depth_image and point_cloud are placeholders for observations
# supplied by the caller (for example, from a camera or simulator).
# Use for grasp prediction
action, info = policy.predict_grasp(
    color_images={"front": rgb_image},
    depth_images={"front": depth_image},
    point_cloud=point_cloud,
    lang_goal="grasp a round object",
)
```
## Training Details
- **Architecture**: ViLGP3D with CLIP ViT-B/32 backbone
- **Hidden dim**: 768
- **Attention heads**: 8
- **Position encoding**: Rotary Position Encoding (RoPE)
- **Training data**: Tabletop manipulation demonstrations
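With a hidden dimension of 768 split across 8 heads, each head works on 768 / 8 = 96 channels. The sketch below shows how rotary position encoding acts on one such head. It uses the common "rotate-half" formulation over scalar positions with a base of 10000, all of which are assumptions for illustration; the encoding actually used in ViLGP3D may differ.

```python
import numpy as np

HIDDEN_DIM = 768
NUM_HEADS = 8
HEAD_DIM = HIDDEN_DIM // NUM_HEADS  # 96 channels per head


def rope(x, positions, base=10000.0):
    """Apply rotary position encoding to x of shape (seq, head_dim)."""
    half = x.shape[-1] // 2
    # One rotation frequency per channel pair (assumed base of 10000).
    freqs = base ** (-np.arange(half) / half)
    angles = positions[:, None] * freqs[None, :]
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)


rng = np.random.default_rng(0)
q = rng.normal(size=(1, HEAD_DIM))
k = rng.normal(size=(1, HEAD_DIM))
# Both pairs of positions are 3 apart, so the attention scores should match.
score_a = rope(q, np.array([2.0])) @ rope(k, np.array([5.0])).T
score_b = rope(q, np.array([10.0])) @ rope(k, np.array([13.0])).T
```

Because each channel pair is only rotated, vector norms are preserved and the query-key score depends on the offset between positions rather than on their absolute values.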
## Related Resources
- [lerobot_policy_a2](https://github.com/dgrachev/lerobot_policy_a2) - Policy package
- [lerobot_grach0v](https://github.com/grach0v/lerobot) - LeRobot fork with A2 environment
- [a2_assets](https://huggingface.co/datasets/dgrachev/a2_assets) - Environment assets
## Citation
```bibtex
@misc{a2_policy,
  author = {Denis Grachev},
  title = {A2 Policy: CLIP-based 6-DOF Grasp and Place Policy},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/dgrachev/a2_pretrained}
}
```