mingxinz committed · Commit 94d7416 · 1 Parent(s): 7ab4492

Add model card and license

Signed-off-by: Mingxin Zheng <mingxinz@nvidia.com>

Files changed (5)
  1. EXPLAINABILITY.md +13 -0
  2. LICENSE +61 -0
  3. PRIVACY.md +11 -0
  4. README.md +132 -0
  5. SAFETY_and_SECURITY.md +6 -0
EXPLAINABILITY.md ADDED
@@ -0,0 +1,13 @@
+ Field | Response
+ :------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------
+ Intended Domain: | Rheo (Isaac for Healthcare simulation workflow focused on preparing for surgical instrument handling tasks in an operating room (OR) environment).
+ Model Type: | Robot Vision Language Action (VLA) model
+ Intended Users: | Isaac for Healthcare users testing operating room environments in simulation.
+ Output: | Action tensor (next 16 actions) for pick-and-place of a sterilized tray from shelf to cart.
+ Describe how the model works: | Accepts vision frames, a language instruction, and robot observations; encodes the multimodal inputs and outputs continuous robot action vectors.
+ Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable
+ Technical Limitations & Mitigation: | This model has been trained and tested with simulation data from the Isaac for Healthcare Rheo workflow and is not expected to generalize outside this environment.
+ Verified to have met prescribed NVIDIA quality standards: | Yes
+ Performance Metrics: | Latency, Task success rate
+ Potential Known Risks: | This model may make unexpected movements if deployed in environments outside the Rheo simulation. The model may not fully adhere to protocols or manage unforeseen scenarios when operating outside the simulated environment.
+ Licensing: | NVIDIA License
LICENSE ADDED
@@ -0,0 +1,61 @@
+ NVIDIA License
+ 1. Definitions
+ “Licensor” means any person or entity that distributes its Work.
+ “Work” means (a) the original work of authorship made available under this license,
+ which may include software, documentation, or other files, and (b) any additions to or
+ derivative works thereof that are made available under this license.
+ The terms “reproduce,” “reproduction,” “derivative works,” and “distribution” have the
+ meaning as provided under U.S. copyright law; provided, however, that for the purposes
+ of this license, derivative works shall not include works that remain separable from, or
+ merely link (or bind by name) to the interfaces of, the Work.
+ Works are “made available” under this license by including in or with the Work either (a)
+ a copyright notice referencing the applicability of this license to the Work, or (b) a copy
+ of this license.
+ 2. License Grant
+ 2.1 Copyright Grant. Subject to the terms and conditions of this license, each
+ Licensor grants to you a perpetual, worldwide, non-exclusive, royalty-free,
+ copyright license to use, reproduce, prepare derivative works of, publicly display,
+ publicly perform, sublicense and distribute its Work and any resulting derivative
+ works in any form.
+ 3. Limitations
+ 3.1 Redistribution. You may reproduce or distribute the Work only if (a) you do so
+ under this license, (b) you include a complete copy of this license with your
+ distribution, and (c) you retain without modification any copyright, patent,
+ trademark, or attribution notices that are present in the Work.
+ 3.2 Derivative Works. You may specify that additional or different terms apply to
+ the use, reproduction, and distribution of your derivative works of the Work (“Your
+ Terms”) only if (a) Your Terms provide that the use limitation in Section 3.3
+ applies to your derivative works, and (b) you identify the specific derivative works
+ that are subject to Your Terms. Notwithstanding Your Terms, this license (including
+ the redistribution requirements in Section 3.1) will continue to apply to the Work
+ itself.
+ 3.3 Use Limitation. The Work and any derivative works thereof only may be used
+ or intended for use non-commercially. Notwithstanding the foregoing, NVIDIA
+ Corporation and its affiliates may use the Work and any derivative works
+ commercially. As used herein, “non-commercially” means for research or
+ evaluation purposes only.
+ 3.4 Patent Claims. If you bring or threaten to bring a patent claim against any
+ Licensor (including any claim, cross-claim or counterclaim in a lawsuit) to enforce
+ any patents that you allege are infringed by any Work, then your rights under this
+ license from such Licensor (including the grant in Section 2.1) will terminate
+ immediately.
+ 3.5 Trademarks. This license does not grant any rights to use any Licensor’s or its
+ affiliates’ names, logos, or trademarks, except as necessary to reproduce the
+ notices described in this license.
+ 3.6 Termination. If you violate any term of this license, then your rights under this
+ license (including the grant in Section 2.1) will terminate immediately.
+ 4. Disclaimer of Warranty.
+ THE WORK IS PROVIDED “AS IS” WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
+ EITHER EXPRESS OR IMPLIED, INCLUDING WARRANTIES OR CONDITIONS OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE OR NON-
+ INFRINGEMENT. YOU BEAR THE RISK OF UNDERTAKING ANY ACTIVITIES UNDER THIS
+ LICENSE.
+ 5. Limitation of Liability.
+ EXCEPT AS PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL
+ THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE
+ SHALL ANY LICENSOR BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT,
+ INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF OR
+ RELATED TO THIS LICENSE, THE USE OR INABILITY TO USE THE WORK (INCLUDING BUT
+ NOT LIMITED TO LOSS OF GOODWILL, BUSINESS INTERRUPTION, LOST PROFITS OR
+ DATA, COMPUTER FAILURE OR MALFUNCTION, OR ANY OTHER DAMAGES OR LOSSES),
+ EVEN IF THE LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
PRIVACY.md ADDED
@@ -0,0 +1,11 @@
+ Field | Response
+ :----------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------
+ Generatable or reverse engineerable personal data? | No
+ Personal data used to create this model? | No
+ Was consent obtained for any personal data used? | Not Applicable
+ How often is dataset reviewed? | Before Release
+ Was data from user interactions with the AI model (e.g. user input and prompts) used to train the model? | No
+ Is there provenance for all datasets used in training? | Yes
+ Does data labeling (annotation, metadata) comply with privacy laws? | Yes
+ Is data compliant with data subject requests for data correction or removal, if such a request was made? | No, not possible with externally-sourced data.
+ Applicable Privacy Policy | https://www.nvidia.com/en-us/about-nvidia/privacy-policy/
README.md ADDED
@@ -0,0 +1,132 @@
+ # Model Overview
+
+ ### Description
+ GR00T-N1.6-Rheo-PickNPlace is a vision language action (VLA) model. It is fine-tuned for surgical instrument handling preparation in the Isaac for Healthcare Rheo workflow, where it performs the pick-and-place of a sterilized box from a shelf to a cart using a G1 embodiment. This model is ready for non-commercial use only (see Section 3.3 of the NVIDIA License).
+
+ ### License/Terms of Use
+ NVIDIA License. Additional Information: Apache-2.0 license for http://huggingface.co/Qwen/Qwen2.5-7B-Instruct and https://huggingface.co/google/siglip2-so400m-patch16-512.
+
+ You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.
+
+ ### Deployment Geography
+ Global
+
+ ### Use Case
+ This model is intended for Rheo simulation workflows focused on surgical instrument handling (sterilized box pick-and-place from shelf to cart). It is not intended for real-world clinical deployment.
+
+ ### Release Date
+ Hugging Face (03/10/2026) via https://huggingface.co/nvidia/GR00T-N1.6-Rheo-PickNPlaceTray/tree/main
+
+ ## Reference(s)
+ - [NVIDIA Isaac-GR00T N1.6](https://github.com/NVIDIA/Isaac-GR00T)
+ - [Isaac for Healthcare](https://github.com/isaac-for-healthcare)
+
+ ## Model Architecture
+ **Architecture Type:** Vision Language Action model
+ **Network Architecture:** GR00T N1.6
+ **This model was developed based on** GR00T N1.6
+ **Number of model parameters:** 3 billion
+
+ ## Computational Load
+ **Cumulative Compute:** 2.45×10^19 FLOPs (hardware-based calculation using a single NVIDIA H100 NVL for training)
+
+ **Estimated Energy and Emissions for Model Training:** 5.37 kWh, 0.00217 tCO₂e
+
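As a quick sanity check on the compute figures, the sketch below derives two implied quantities from the three reported numbers: the carbon intensity of the electricity used and the delivered FLOPs per joule. Only the reported values come from the model card. The variable names and the derived values are illustrative, and the calculation assumes the energy figure covers the same training run as the FLOP count.

```python
# Values reported in the model card.
cumulative_flops = 2.45e19    # FLOPs
energy_kwh = 5.37             # kWh
emissions_tco2e = 0.00217     # tonnes CO2-equivalent

# Derived quantities (illustrative, not stated in the card).
carbon_intensity = emissions_tco2e * 1000 / energy_kwh   # kgCO2e per kWh
energy_joules = energy_kwh * 3.6e6                       # 1 kWh = 3.6e6 J
flops_per_joule = cumulative_flops / energy_joules

print(f"{carbon_intensity:.3f} kgCO2e/kWh")   # 0.404 kgCO2e/kWh
print(f"{flops_per_joule:.3e} FLOPs/J")       # 1.267e+12 FLOPs/J
```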
+ ## Input(s)
+ **Input Type(s):** Vision, State, Language Instruction
+ **Input Format(s):**
+ - Vision: RGB images (uint8)
+ - State: Floating point
+ - Language Instruction: String
+
+ **Input Parameters:**
+ - Vision: Two-Dimensional (2D)
+ - State: One-Dimensional (1D)
+ - Language Instruction: One-Dimensional (1D)
+
+ **Other Properties Related to Input:**
+ - Vision: Raw 480x640 uint8 RGB frames from robot head camera; training preprocessing uses shortest_edge=256 with crop_fraction=0.95 (albumentations).
+ - State: 1x31 vector.
+
+ ## Output(s)
+ **Output Type(s):** Actions
+ **Output Format(s):** Continuous-value vectors
+ **Output Parameters:** Two-Dimensional (2D), 16x32 tensor
+ **Other Properties Related to Output:** Continuous-value vectors correspond to different motor controls on the robot embodiment.
+
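The input and output descriptions amount to a shape contract, which can be checked before and after a policy call. The following is a minimal sketch of such a check, not part of the GR00T API: the function names are hypothetical, the channel-last (height, width, RGB) image layout is an assumption, and the instruction string is a placeholder. Only the 480x640 uint8 frame, the 1x31 state, and the 16x32 action tensor come from the model card.

```python
import numpy as np

# Shapes stated in the model card; the channel-last layout is an assumption.
IMAGE_SHAPE = (480, 640, 3)   # height x width x RGB
STATE_SHAPE = (1, 31)
ACTION_SHAPE = (16, 32)       # 16 future steps x 32 control values


def validate_observation(image, state, instruction):
    """Raise ValueError if an observation does not match the documented inputs."""
    if image.shape != IMAGE_SHAPE or image.dtype != np.uint8:
        raise ValueError(f"expected uint8 image {IMAGE_SHAPE}, got {image.dtype} {image.shape}")
    if state.shape != STATE_SHAPE or not np.issubdtype(state.dtype, np.floating):
        raise ValueError(f"expected float state {STATE_SHAPE}, got {state.dtype} {state.shape}")
    if not isinstance(instruction, str) or not instruction:
        raise ValueError("instruction must be a non-empty string")


def validate_action(action):
    """Raise ValueError if a model output does not match the documented action tensor."""
    if action.shape != ACTION_SHAPE or not np.issubdtype(action.dtype, np.floating):
        raise ValueError(f"expected float action {ACTION_SHAPE}, got {action.dtype} {action.shape}")


# Dummy arrays standing in for a real observation and a real model output.
image = np.zeros(IMAGE_SHAPE, dtype=np.uint8)
state = np.zeros(STATE_SHAPE, dtype=np.float32)
validate_observation(image, state, "placeholder instruction")
validate_action(np.zeros(ACTION_SHAPE, dtype=np.float32))
```

Failing fast on a shape or dtype mismatch is cheaper than debugging unexpected robot motion caused by a silently broadcast or truncated tensor.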
+ Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
+
+ ## Software Integration
+ **Runtime Engine(s):** PyTorch 2.8.0
+
+ **Supported Hardware Microarchitecture Compatibility:**
+ - NVIDIA Ampere
+ - NVIDIA Blackwell
+ - NVIDIA Hopper
+
+ **Preferred/Supported Operating System(s):**
+ - Linux (Ubuntu 22.04/24.04 LTS)
+
+ ## Model Version(s)
+ GR00T-N1.6-Rheo-PickNPlace
+
+ ## Training Datasets, Testing, and Evaluation Datasets
+ Manual teleoperation and IsaacLab mimic generation.
+
+ ### Training Dataset
+ **Total Size:** 120 samples
+ **Text Training Data Size:** Less than a Billion Tokens
+ **Video Training Data Size:** Less than 10,000 Hours
+ **Non-Audio, Image, Text Training Data Size:**
+
+ - Image/Video Data: RGB video frames from robot head camera (640x480 pixels)
+ - Text Data: 120 language instruction strings, labelled by humans
+ - Action Data: 120 episodes of robot action trajectories (state observations and action sequences)
+
+ **Data Modality:**
+ - Text
+ - Video
+ - Action
+
+ **Data Collection Method by dataset:** Automatic/Sensors
+ **Labeling Method by dataset:** Human
+
+ **Data Properties:**
+ - Quantity: 120 simulation samples
+ - Modalities: Multi-modal data consisting of (i) RGB video frames, (ii) text-based language instructions, (iii) robot state observations
+ - Nature of Content: Data from the Isaac Sim simulation environment collected in Isaac Lab mimic; no personal data or copyright-protected content; data represents surgical instrument manipulation tasks
+ - Linguistic Characteristics: Language instructions describing surgical instrument preparation
+
+ **Sensor(s):**
+ - Vision sensors: RGB cameras (robot head-mounted) capturing 640x480 pixel images in simulation
+ - Action sensors: Motor sensors on G1 embodiment
+
+
+ ### Testing Datasets
+ **Data Collection Method by dataset:** Not Applicable
+ **Labeling Method by dataset:** Not Applicable
+ **Data Properties:**
+ The evaluation was performed in simulation using the Isaac for Healthcare Rheo workflow. The testing data consists of dynamically generated episodes of the pick-and-place task.
+
+ ### Evaluation Datasets
+ **Data Collection Method by dataset:** Not Applicable
+ **Labeling Method by dataset:** Not Applicable
+ **Data Properties:**
+ The evaluation was performed in simulation using the Isaac for Healthcare Rheo workflow. The testing data consists of dynamically generated episodes of the pick-and-place task.
+
+ ## Inference
+ **Engine:** PyTorch
+ **Test Hardware:** NVIDIA RTX 5880 Ada Generation
+ **Inference mode / Latency / Memory:** PyTorch 92.4 ± 1.3 ms, 8 GB
+
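To put the latency figure in context, the sketch below estimates how many actions per second the model could supply. It assumes one inference call returns the full 16-step action chunk (the 16x32 output tensor), that calls run back to back without overlap, and that every action in a chunk is executed. None of these scheduling details are stated in the model card, so treat the result as a rough upper bound and not as the workflow's actual control rate.

```python
# Reported in the model card.
latency_ms = 92.4    # mean inference latency on the test hardware
chunk_len = 16       # actions per inference, from the 16x32 output tensor

# Derived under the assumptions stated above (illustrative only).
inferences_per_s = 1000.0 / latency_ms
actions_per_s = chunk_len * inferences_per_s

print(f"{inferences_per_s:.2f} inferences/s")   # 10.82 inferences/s
print(f"{actions_per_s:.1f} actions/s")         # 173.2 actions/s
```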
+ ## Limitations
+ This model was trained only on data from the Isaac for Healthcare Rheo workflow, so it is only expected to perform well in that specific simulated operating room environment. It is not expected to generalize to different robot platforms, environments, or surgical procedures outside of the trained domain.
+
+ ## Ethical Considerations
+ NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
+
+ For more detailed information on ethical considerations for this model, please see the Model Card++ Bias, Explainability, Safety & Security, and Privacy Subcards.
+
+ Please make sure you have proper rights and permissions for all input image and video content; if image or video includes people, personal health information, or intellectual property, the image or video generated will not blur or maintain proportions of image subjects included.
+
+ Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://app.intigriti.com/programs/nvidia/nvidiavdp/detail).
SAFETY_and_SECURITY.md ADDED
@@ -0,0 +1,6 @@
+ Field | Response
+ :---------------------------------------------------|:----------------------------------
+ Model Application Field(s): | Healthcare, Medical Devices, Machinery and Robotics
+ Describe the life critical impact (if present). | This model is not tested or intended for use in mission-critical applications that require functional safety. The use of the model in those applications is at the user's own risk and sole responsibility, including taking the necessary steps to add needed guardrails or safety mechanisms prior to deployment.
+ Use Case Restrictions: | Abide by the NVIDIA License
+ Model and dataset restrictions: | The Principle of Least Privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to.