---
license: mit
---

# Geometric Reasoning Networks (GRN)

## Model Details

- **Model name:** Geometric Reasoning Networks (GRN)
- **Model type:** Graph Neural Network for robot manipulation feasibility prediction
- **Framework:** PyTorch, PyTorch Geometric
- **Associated paper:** *Learning Geometric Reasoning Networks for Robot Task and Motion Planning*, ICLR 2025
- **Authors:** Smail Ait Bouhsain, Rachid Alami, Thierry Siméon
- **License:** See repository LICENSE
- **Paper:** https://openreview.net/pdf?id=ajxAJ8GUX4
- **Repository:** https://github.com/smail8/geometric_reasoning_networks

GRN is a learned geometric reasoning model designed to augment **robot Task and Motion Planning (TAMP)** by predicting the feasibility of manipulation actions in cluttered 3D environments.

---

## Model Description

The Geometric Reasoning Network (GRN) operates on **graph-structured representations of manipulation scenes**, encoding objects, candidate grasps, actions, and their geometric relations. It predicts action and grasp feasibility by reasoning jointly over learned geometric constraints.

GRN integrates predictions from auxiliary learned modules:
- **Inverse Kinematics (IK) feasibility**
- **Grasp Obstruction (GO)**
- **Action and Grasp Feasibility (AGF)**
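
To make the idea of a graph-structured scene representation concrete, here is a minimal sketch in plain Python. It is not taken from the GRN codebase (which uses PyTorch Geometric): the names `SceneObject` and `build_scene_graph`, the node feature layout, and the choice of edge attributes are all hypothetical. It only illustrates the general pattern of objects as nodes and pairwise geometric relations as edge attributes. In the example scene, the edge from the cube to the box carries the offset between the two positions and a distance of about 0.5 m.

```python
from dataclasses import dataclass
from itertools import permutations
import math


@dataclass
class SceneObject:
    """Hypothetical object node: a name, a 3D position, and box extents."""
    name: str
    position: tuple  # (x, y, z) in metres, assumed to be in a common world frame
    size: tuple      # (dx, dy, dz) bounding-box extents in metres


def build_scene_graph(objects):
    """Return per-node feature vectors and directed edges with geometric attributes."""
    # Node features: position concatenated with size (an assumed layout).
    nodes = [list(o.position) + list(o.size) for o in objects]
    edges = {}
    # Fully connected directed graph: one edge per ordered pair of objects.
    for i, j in permutations(range(len(objects)), 2):
        a, b = objects[i], objects[j]
        offset = tuple(q - p for p, q in zip(a.position, b.position))
        edges[(i, j)] = {
            "offset": offset,
            "distance": math.dist(a.position, b.position),
        }
    return nodes, edges


scene = [
    SceneObject("cube", (0.0, 0.0, 0.0), (0.05, 0.05, 0.05)),
    SceneObject("box", (0.3, 0.4, 0.0), (0.10, 0.10, 0.20)),
]
nodes, edges = build_scene_graph(scene)
```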

---

## Intended Use

### Primary Use Cases
- Augmenting robot Task and Motion Planning (TAMP)
- Learned feasibility prediction for manipulation actions
- Research in robotic manipulation and geometric reasoning
- Benchmarking learned planning heuristics
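
As an illustration of how a learned feasibility predictor can augment a TAMP search, the sketch below prunes and ranks candidate actions by predicted feasibility before any expensive motion-planning call is made. Everything in it is hypothetical: `filter_candidates` is not part of the GRN repository, `predict_feasibility` stands in for a trained model, the scores are made up, and the threshold of 0.5 is an assumed value rather than one from the paper. With the made-up scores shown, the side grasp on A is discarded and the two remaining actions are returned in order of decreasing score.

```python
def filter_candidates(candidates, predict_feasibility, threshold=0.5):
    """Keep candidates whose predicted feasibility meets the threshold, best first."""
    scored = [(predict_feasibility(c), c) for c in candidates]
    kept = [(p, c) for p, c in scored if p >= threshold]
    kept.sort(key=lambda pc: pc[0], reverse=True)
    return [c for _, c in kept]


# Stand-in predictor with made-up scores; a real system would query the model.
fake_scores = {"pick(A, top)": 0.9, "pick(A, side)": 0.2, "pick(B, top)": 0.6}
plan_order = filter_candidates(fake_scores, fake_scores.get)
```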

### Out-of-Scope Use
- Low-level control or trajectory optimization
- Safety-critical deployment without validation
- Non-robotic domains

---

## Model Architecture

- Graph Attention Network with message passing
- Nodes represent objects
- Edges encode geometric and relational constraints
- Supervised training on large-scale simulated datasets
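
The architecture points above can be illustrated with a single attention-weighted message-passing step written in plain Python. This is a generic sketch of graph attention (softmax-normalised scores over incoming neighbours), not the GRN layer itself. The dot-product scoring function, the scalar `edge_bias` term, and the function name `attention_step` are assumptions chosen to keep the example short. In practice a PyTorch Geometric layer would play this role.

```python
import math


def attention_step(node_feats, edge_bias):
    """One round of attention-weighted neighbour aggregation.

    node_feats: list of feature vectors, one per node.
    edge_bias: dict mapping (source, target) to a scalar added to the raw score.
    """
    updated = []
    for tgt, h_tgt in enumerate(node_feats):
        sources = [s for (s, t) in edge_bias if t == tgt]
        if not sources:
            updated.append(list(h_tgt))  # a node without incoming edges keeps its features
            continue
        # Raw score: dot product of source and target features plus an edge term.
        scores = [
            sum(a * b for a, b in zip(node_feats[s], h_tgt)) + edge_bias[(s, tgt)]
            for s in sources
        ]
        # Softmax over incoming edges, shifted by the max for numerical stability.
        top = max(scores)
        weights = [math.exp(x - top) for x in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # New feature vector: attention-weighted sum of the source features.
        updated.append([
            sum(w * node_feats[s][k] for w, s in zip(weights, sources))
            for k in range(len(h_tgt))
        ])
    return updated


feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = {(0, 2): 0.0, (1, 2): 0.0, (2, 0): 0.0}
new_feats = attention_step(feats, bias)
```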

---

## Training Details

- **Training data:** GRN simulated manipulation datasets
- **Loss:** Binary cross-entropy + mean squared error
- **Optimizer:** Adam
- **Hyperparameters:**

| Module | Batch size | Learning rate | Epochs |
| ------ | ---------- | ------------- | ------ |
| IK     | 8192       | 1e-3          | 100    |
| GO     | 8192       | 1e-3          | 100    |
| AGF    | 2048       | 1e-4          | 100    |
| GRN    | 2048       | 1e-4          | 100    |

- **Hardware:** NVIDIA RTX A5000 GPU
- **Training time:** 15 hours
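
The loss entry above combines a classification term with a regression term. A minimal sketch of such a combined objective in plain Python follows. This card does not state which outputs receive which term or how the two are weighted, so the `mse_weight` parameter and the function names are assumptions. In a PyTorch training loop the same quantities would typically come from `torch.nn.BCELoss` (or `torch.nn.BCEWithLogitsLoss`) and `torch.nn.MSELoss`.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def mean_squared_error(preds, targets):
    """Mean of squared differences between predictions and targets."""
    return sum((a - b) ** 2 for a, b in zip(preds, targets)) / len(preds)


def combined_loss(probs, labels, preds, targets, mse_weight=1.0):
    """Sum of the classification and regression terms (the weighting is an assumption)."""
    return binary_cross_entropy(probs, labels) + mse_weight * mean_squared_error(preds, targets)


loss = combined_loss([0.9, 0.1], [1, 0], [0.5, 0.5], [1.0, 0.0])
```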

---

## Evaluation

Evaluation is performed on both in-distribution and out-of-distribution datasets with increasing scene complexity.

Metrics include:
- Feasibility prediction accuracy
- Downstream task and motion planning success rate
- Generalization to unseen clutter levels and object sizes

GRN outperforms MLP, CNN-based, and standard GNN baselines. Please refer to the paper for detailed results.

---

## Limitations

- Trained exclusively in simulation
- Performance degrades for extreme clutter or unseen geometries
- Assumes accurate scene geometry and object poses

---

## Ethical Considerations

This model does not involve human data. Users are responsible for ensuring safe deployment when integrated into physical robotic systems.

---

## Citation

```bibtex
@inproceedings{ait2025learning,
  title={Learning Geometric Reasoning Networks for Robot Task and Motion Planning},
  author={Ait Bouhsain, Smail and Alami, Rachid and Simeon, Thierry},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}
```