Ryukijano committed
Commit 17c92b9 · verified · 1 Parent(s): 9b98c92

Add Gemma-GR00T model weights

Files changed (1)
  1. README.md +35 -8
README.md CHANGED
@@ -3,6 +3,11 @@ language:
  - en
  license: mit
  library_name: transformers
  tags:
  - robotics
  - reinforcement-learning
@@ -10,7 +15,14 @@ tags:
  - gemma
  - gr00t
  - nvidia
- pipeline_tag: reinforcement-learning
  ---

  # Gemma-GR00T: Multimodal Robotic Manipulation with Language Models
@@ -26,7 +38,10 @@ Gemma-GR00T is a state-of-the-art multimodal vision-language-action policy that
  - **Language(s) (NLP):** English
  - **License:** MIT
  - **Finetuned from model:** [NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops](https://huggingface.co/NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops)
- - **Training Data:** Trained on the LeRobot dataset using the `fourier_gr1_arms_only` configuration

  ### Model Architecture

@@ -43,11 +58,19 @@ Gemma-GR00T is a state-of-the-art multimodal vision-language-action policy that

  ### Direct Use

- This model is intended for research and development of robotic manipulation systems. It can be used for:
- - Robotic arm manipulation tasks
- - Sim-to-real transfer learning
- - Multimodal robotic control
- - Research in reinforcement and imitation learning

  ### Out-of-Scope Use

@@ -97,8 +120,12 @@ def run_inference(observation, language_instruction):

  This model was trained using the [LeRobot](https://github.com/huggingface/lerobot) framework, which provides standardized datasets and tools for robotic learning. The training utilized the following configuration:

- - **Dataset:** LeRobot's standardized robotic manipulation dataset
  - **Data Configuration:** `fourier_gr1_arms_only`
  - **Environment:** [Isaac Sim](https://developer.nvidia.com/isaac-sim)
  - **Training Steps:** 30,000
  - **Batch Size:** 32
 
@@ -3,6 +3,11 @@ language:
  - en
  license: mit
  library_name: transformers
+ pipeline_tag: reinforcement-learning
+ datasets:
+ - lerobot/robot_sim.PickNPlace
+ - lerobot/so100_strawberry_grape
+ base_model: NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops
  tags:
  - robotics
  - reinforcement-learning
 
@@ -10,7 +15,14 @@ tags:
  - gemma
  - gr00t
  - nvidia
+ - lerobot
+ - vision-language-action
+ - robot-manipulation
+ - gemma-le
+ - diffusion-policy
+ - le-robot
+ - robot-learning
+ - embodied-ai
  ---

  # Gemma-GR00T: Multimodal Robotic Manipulation with Language Models
 
@@ -26,7 +38,10 @@ Gemma-GR00T is a state-of-the-art multimodal vision-language-action policy that
  - **Language(s) (NLP):** English
  - **License:** MIT
  - **Finetuned from model:** [NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops](https://huggingface.co/NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops)
+ - **Training Data:** Trained on LeRobot datasets using the `fourier_gr1_arms_only` configuration
+ - **Framework:** PyTorch with Hugging Face Transformers
+ - **Related Models:** [LeRobot Models](https://huggingface.co/lerobot)
+ - **Related Datasets:** [LeRobot Datasets](https://huggingface.co/lerobot/datasets)

  ### Model Architecture

 
@@ -43,11 +58,19 @@ Gemma-GR00T is a state-of-the-art multimodal vision-language-action policy that

  ### Direct Use

+ This model is part of the [Gemma-GR00T](https://github.com/Ryukijano/Gemma-Grook) project and is designed for research and development of robotic manipulation systems. It can be used for:
+
+ - Robotic arm manipulation tasks (pick-and-place, assembly, etc.)
+ - Sim-to-real transfer learning in robotics
+ - Multimodal robotic control with natural language instructions
+ - Research in reinforcement and imitation learning for robotics
+ - Integration with the [LeRobot](https://github.com/huggingface/lerobot) ecosystem
+
+ ### Related Projects
+
+ - [LeRobot](https://github.com/huggingface/lerobot): The base framework used for training
+ - [GR00T](https://developer.nvidia.com/gr00t): NVIDIA's foundation model for humanoid robots
+ - [Gemma](https://huggingface.co/google/gemma-7b): The language model backbone

  ### Out-of-Scope Use

 
@@ -97,8 +120,12 @@ def run_inference(observation, language_instruction):

  This model was trained using the [LeRobot](https://github.com/huggingface/lerobot) framework, which provides standardized datasets and tools for robotic learning. The training utilized the following configuration:

+ - **Primary Datasets:**
+ - `lerobot/robot_sim.PickNPlace`: Simulated pick and place tasks
+ - `lerobot/so100_strawberry_grape`: Real-world manipulation tasks
  - **Data Configuration:** `fourier_gr1_arms_only`
+ - **Dataset Documentation:** [LeRobot Datasets](https://huggingface.co/lerobot/datasets)
+ - **Data Processing:** Follows LeRobot's standardized data pipeline for consistency with other models in the ecosystem
  - **Environment:** [Isaac Sim](https://developer.nvidia.com/isaac-sim)
  - **Training Steps:** 30,000
  - **Batch Size:** 32
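
To put the training configuration listed in the README in perspective, the sketch below multiplies the stated training steps by the stated batch size. It assumes the batch size of 32 is the global batch per optimizer step with no gradient accumulation, which the README does not say. `TrainingConfig` is a hypothetical container for illustration, not a LeRobot or GR00T class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for values quoted in the README (not a LeRobot class)."""

    data_config: str = "fourier_gr1_arms_only"
    training_steps: int = 30_000
    batch_size: int = 32  # assumption: global batch size per optimizer step

    def samples_seen(self) -> int:
        # Total samples drawn over training, assuming no gradient accumulation.
        return self.training_steps * self.batch_size


config = TrainingConfig()
print(f"{config.samples_seen():,} samples")  # 30,000 * 32 = 960,000
```

Under those assumptions the run draws 960,000 samples in total. How many passes over the data that amounts to depends on the size of the two listed datasets, which the README does not give.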