---
library_name: transformers
pipeline_tag: robotics
tags:
- robotics
- foundation-model
- gr00t
- dual-camera
- robot-learning
- manipulation
- embodied-ai
model_type: gr00t
datasets:
- so101_wave_300k_dualcam
language:
- en
base_model_relation: finetune
widget:
- example_title: "Robot Manipulation"
text: "Dual camera robotics control for manipulation tasks"
---
# GR00T Wave: Dual Camera Robotics Foundation Model
## Model Overview
GR00T Wave is a robotics foundation model fine-tuned on dual-camera manipulation data from the SO101 Wave dataset. It targets manipulation tasks that benefit from synchronized dual-camera visual input for spatial understanding.
## Key Features
- **Dual Camera Input**: Processes synchronized dual-camera feeds for enhanced spatial understanding
- **Foundation Model Architecture**: Built on the GR00T framework for robust robotics applications
- **300K Training Steps**: Extensive training on high-quality manipulation demonstrations
- **Manipulation Focused**: Optimized for robotic manipulation and control tasks
## Model Details
- **Model Type**: GR00T Robotics Foundation Model
- **Training Data**: SO101 Wave 300K Dual Camera Dataset
- **Architecture**: Transformer-based with dual camera encoders
- **Training Steps**: 300,000 steps with checkpoints at 150K and 300K
- **Input Modalities**: Dual RGB cameras, robot state
- **Output**: Robot actions and control commands
## Usage
```python
from transformers import AutoModel

# Load the model (trust_remote_code is required for the custom GR00T architecture)
model = AutoModel.from_pretrained("cagataydev/gr00t-wave", trust_remote_code=True)

# Note: inference requires a specialized robotics pipeline that feeds
# dual-camera frames and robot state to the model.
```
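The specialized pipeline feeds two synchronized RGB frames plus the robot state into the model. A rough sketch of what such an observation batch might look like; the key names, 224×224 resolution, and 7-dimensional state below are illustrative assumptions, not documented values (NumPy arrays stand in for the tensors a real pipeline would build):

```python
import numpy as np

# Hypothetical observation layout for one control step; the exact
# preprocessing expected by GR00T Wave is not documented in this card.
def build_observation(batch_size=1, height=224, width=224, state_dim=7):
    """Assemble a dummy observation: two synchronized RGB frames plus robot state."""
    return {
        "camera_1": np.zeros((batch_size, 3, height, width), dtype=np.float32),
        "camera_2": np.zeros((batch_size, 3, height, width), dtype=np.float32),
        "robot_state": np.zeros((batch_size, state_dim), dtype=np.float32),
    }

obs = build_observation()
print({name: arr.shape for name, arr in obs.items()})
```

In a deployed system these arrays would be normalized camera frames and measured joint/gripper values rather than zeros.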
## Training Configuration
- **Base Model**: GR00T N1.5-3B
- **Dataset**: SO101 Wave 300K Dual Camera
- **Training Framework**: Custom robotics training pipeline
- **Batch Size**: Optimized for dual camera inputs
- **Optimization**: AdamW with custom learning rate scheduling
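The card says only "AdamW with custom learning rate scheduling." A common instantiation for a 300K-step run is linear warmup followed by cosine decay; the sketch below uses that pattern, and every numeric value (warmup length, base and minimum LR) is an assumption, not a published hyperparameter:

```python
import math

def warmup_cosine_lr(step, max_steps=300_000, warmup_steps=2_000,
                     base_lr=1e-4, min_lr=1e-6):
    """Linear warmup to base_lr, then cosine decay to min_lr.

    Illustrative only; the actual GR00T Wave schedule is not documented.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps       # linear ramp-up
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

Such a function plugs directly into `torch.optim.lr_scheduler.LambdaLR` as a per-step multiplier.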
## Model Files
The repository contains:
- **SafeTensors Model Files**:
- `model-00001-of-00002.safetensors` (4.7GB)
- `model-00002-of-00002.safetensors` (2.4GB)
- **Configuration Files**:
- `config.json`
- `model.safetensors.index.json`
- **Training Checkpoints**:
- `checkpoint-150000/` (16GB)
- `checkpoint-300000/` (16GB)
- **Training Metadata**:
- `trainer_state.json`
- `training_args.bin`
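Since the two checkpoint directories account for roughly 32GB of the ~40GB repository, it is often worth fetching only the final top-level weights. The selection logic can be sketched as below (the checkpoint file names are illustrative; `huggingface_hub.snapshot_download` accepts equivalent `allow_patterns`/`ignore_patterns` arguments to do this server-side):

```python
from fnmatch import fnmatch

def select_final_weights(files):
    """Keep top-level weights and configs; skip the 16GB checkpoint directories."""
    keep = ("*.safetensors", "*.json")
    return [f for f in files
            if "/" not in f and any(fnmatch(f, pattern) for pattern in keep)]

repo_files = [
    "config.json",
    "model.safetensors.index.json",
    "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors",
    "checkpoint-150000/model.safetensors",   # illustrative checkpoint contents
    "checkpoint-300000/model.safetensors",
    "trainer_state.json",
    "training_args.bin",
]
print(select_final_weights(repo_files))
```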
## Evaluation
The model has been evaluated on standard robotics manipulation benchmarks with the following approach:
- **Evaluation Steps**: 150 per checkpoint
- **Trajectory Count**: 5 trajectories per evaluation
- **Data Configuration**: SO100 dual camera setup
- **Metrics**: Success rate, manipulation accuracy, and task completion
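With only 5 trajectories per evaluation, the success-rate metric reduces to a simple fraction of completed rollouts. A minimal sketch of that computation (illustrative, not the card's actual evaluation code):

```python
def success_rate(trajectory_outcomes):
    """Fraction of evaluation rollouts that completed the task."""
    if not trajectory_outcomes:
        return 0.0
    return sum(trajectory_outcomes) / len(trajectory_outcomes)

# 5 trajectories per evaluation, as in the protocol above
print(success_rate([True, True, False, True, True]))  # → 0.8
```

Note that with 5 rollouts the metric is coarse (granularity of 0.2), so comparisons between the 150K and 300K checkpoints should be read with that in mind.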
## Applications
This model is suitable for:
- **Robotic Manipulation**: Pick and place operations
- **Dual Camera Systems**: Tasks requiring stereo vision
- **Manufacturing Automation**: Assembly and quality control
- **Research**: Foundation for robotics research and development
## Technical Specifications
- **Model Size**: ~7.1GB (SafeTensors format)
- **Total Repository Size**: ~40GB (including checkpoints)
- **Inference Requirements**: GPU with sufficient VRAM for transformer inference
- **Framework Compatibility**: Transformers, PyTorch
## Installation
```bash
# Install required dependencies
pip install transformers torch torchvision
pip install huggingface_hub
# Log in to Hugging Face (required if the model repository is private)
huggingface-cli login
```
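After installing, a quick sanity check that the dependencies resolve before attempting to load the ~7GB model (a convenience sketch, not part of the model's tooling):

```python
import importlib.util

def missing_dependencies(packages=("transformers", "torch", "huggingface_hub")):
    """Return the required packages that cannot be found in this environment."""
    return [pkg for pkg in packages if importlib.util.find_spec(pkg) is None]

missing = missing_dependencies()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies found.")
```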
## Limitations
- Requires specialized robotics inference pipeline
- Optimized for specific dual camera configurations
- Performance may vary with different robot platforms
- Requires adequate computational resources for real-time inference
## Model Card
This model card summarizes the capabilities, limitations, and intended use cases of GR00T Wave, a dual-camera robotics foundation model fine-tuned from GR00T N1.5-3B.
## Ethical Considerations
This model is designed for robotics research and industrial applications. Users should ensure:
- Safe deployment in robotics systems
- Appropriate safety measures for physical robot control
- Compliance with relevant safety standards
- Responsible use in manufacturing and research environments
## Version History
- **v1.0**: Initial release with 300K step training
- **Checkpoints**: Available at 150K and 300K training steps
## Support
For technical questions and implementation support, please refer to the model documentation and community resources.