---
datasets:
- nvidia/PhysicalAI-Robotics-GraspGen
language:
- en
---

Project Website: https://graspgen.github.io/ <br>
Code: https://github.com/NVlabs/GraspGen/

Abstract: Grasping is a fundamental robot skill, yet despite significant research advancements, learning-based 6-DOF grasping approaches are still not turnkey and struggle to generalize across different embodiments and in-the-wild settings. We build upon the recent success on modeling the object-centric grasp generation process as an iterative diffusion process. Our proposed framework - GraspGen - consists of a Diffusion-Transformer architecture that enhances grasp generation, paired with an efficient discriminator to score and filter sampled grasps. We introduce a novel and performant on-generator training recipe for the discriminator. To scale GraspGen to both objects and grippers, we release a new simulated dataset consisting of over 53 million grasps. We demonstrate that GraspGen outperforms prior methods in simulations with singulated objects across different grippers, achieves state-of-the-art performance on the FetchBench grasping benchmark, and performs well on a real robot with noisy visual observations.

## Model Architecture: <br> 
**Architecture Type:** Diffusion Model, Point Cloud network. See paper for more details. <br>

## Input: <br>
**Input Type(s):** Object partial point cloud X, Number of grasps to sample (B) <br>
**Input Format(s):** Point Cloud (N X 3) where N is the number of points <br>
**Input Parameters:** 3D <br>
**Other Properties Related to Input:** Point cloud needs to be in the form (N X xyz) where N=2048 is the number of points.<br>

## Output: <br>
**Output Type(s):** Grasp Poses; Corresponding confidence scores <br>
**Output Format:** Homogenous Transformation matrices; score is a scalar value from 0 to 1 <br>
**Output Parameters:** [B, 4, 4] where B is the number of generated grasp poses; [B, 1] confidence score <br>
**Other Properties Related to Output:** <br>