Add Gemma-GR00T model weights
README.md CHANGED
@@ -3,6 +3,11 @@ language:
 - en
 license: mit
 library_name: transformers
+pipeline_tag: reinforcement-learning
+datasets:
+- lerobot/robot_sim.PickNPlace
+- lerobot/so100_strawberry_grape
+base_model: NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops
 tags:
 - robotics
 - reinforcement-learning
@@ -10,7 +15,14 @@ tags:
 - gemma
 - gr00t
 - nvidia
-
+- lerobot
+- vision-language-action
+- robot-manipulation
+- gemma-le
+- diffusion-policy
+- le-robot
+- robot-learning
+- embodied-ai
 ---
 
 # Gemma-GR00T: Multimodal Robotic Manipulation with Language Models
@@ -26,7 +38,10 @@ Gemma-GR00T is a state-of-the-art multimodal vision-language-action policy that
 - **Language(s) (NLP):** English
 - **License:** MIT
 - **Finetuned from model:** [NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops](https://huggingface.co/NVEagle/eagle_er-qwen3_1_7B-Siglip2_400M_stage1_5_128gpu_er_v7_1mlp_nops)
-- **Training Data:** Trained on
+- **Training Data:** Trained on LeRobot datasets using the `fourier_gr1_arms_only` configuration
+- **Framework:** PyTorch with Hugging Face Transformers
+- **Related Models:** [LeRobot Models](https://huggingface.co/lerobot)
+- **Related Datasets:** [LeRobot Datasets](https://huggingface.co/lerobot/datasets)
 
 ### Model Architecture
 
@@ -43,11 +58,19 @@ Gemma-GR00T is a state-of-the-art multimodal vision-language-action policy that
 
 ### Direct Use
 
-This model is
-
--
--
--
+This model is part of the [Gemma-GR00T](https://github.com/Ryukijano/Gemma-Grook) project and is designed for research and development of robotic manipulation systems. It can be used for:
+
+- Robotic arm manipulation tasks (pick-and-place, assembly, etc.)
+- Sim-to-real transfer learning in robotics
+- Multimodal robotic control with natural language instructions
+- Research in reinforcement and imitation learning for robotics
+- Integration with the [LeRobot](https://github.com/huggingface/lerobot) ecosystem
+
+### Related Projects
+
+- [LeRobot](https://github.com/huggingface/lerobot): The base framework used for training
+- [GR00T](https://developer.nvidia.com/gr00t): NVIDIA's foundation model for humanoid robots
+- [Gemma](https://huggingface.co/google/gemma-7b): The language model backbone
 
 ### Out-of-Scope Use
 
@@ -97,8 +120,12 @@ def run_inference(observation, language_instruction):
 
 This model was trained using the [LeRobot](https://github.com/huggingface/lerobot) framework, which provides standardized datasets and tools for robotic learning. The training utilized the following configuration:
 
-- **
+- **Primary Datasets:**
+  - `lerobot/robot_sim.PickNPlace`: Simulated pick and place tasks
+  - `lerobot/so100_strawberry_grape`: Real-world manipulation tasks
 - **Data Configuration:** `fourier_gr1_arms_only`
+- **Dataset Documentation:** [LeRobot Datasets](https://huggingface.co/lerobot/datasets)
+- **Data Processing:** Follows LeRobot's standardized data pipeline for consistency with other models in the ecosystem
 - **Environment:** [Isaac Sim](https://developer.nvidia.com/isaac-sim)
 - **Training Steps:** 30,000
 - **Batch Size:** 32
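The last hunk header above references a `run_inference(observation, language_instruction)` helper from the README's usage section. As a rough illustration of that calling convention only: the policy class, observation keys, action dimensions, and chunk length below are all assumptions made for this sketch, not the repository's actual API.

```python
# Hypothetical sketch of the run_inference(observation, language_instruction)
# entry point named in the README. StubPolicy stands in for the real
# Gemma-GR00T policy; every name and dimension here is an assumption.
import numpy as np


class StubPolicy:
    """Stand-in policy: maps an image observation plus a language
    instruction to a short chunk of normalized joint-space actions."""

    action_dim = 7   # assumed arm DoF for an arms-only configuration
    chunk_size = 16  # assumed action-chunk length

    def predict(self, image, instruction):
        # A real policy would encode the image and the instruction with
        # its vision-language backbone and decode an action chunk; here
        # we just return deterministic pseudo-actions per instruction.
        seed = abs(hash(instruction)) % (2**32)
        rng = np.random.default_rng(seed)
        return rng.uniform(-1.0, 1.0, size=(self.chunk_size, self.action_dim))


def run_inference(observation, language_instruction, policy=StubPolicy()):
    """Return a (chunk_size, action_dim) array of actions in [-1, 1]."""
    actions = policy.predict(observation["image"], language_instruction)
    return np.clip(actions, -1.0, 1.0)


obs = {"image": np.zeros((224, 224, 3), dtype=np.uint8)}
actions = run_inference(obs, "pick up the strawberry")
print(actions.shape)  # (16, 7)
```

The chunked output mirrors the action-chunking style common to diffusion-policy controllers, where the robot executes several predicted steps before re-querying the model; the real model's chunk length and action space should be taken from its config, not from this sketch.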