---
library_name: transformers
pipeline_tag: robotics
tags:
- robotics
- foundation-model
- gr00t
- dual-camera
- robot-learning
- manipulation
- embodied-ai
model_type: gr00t
datasets:
- so101_wave_300k_dualcam
language:
- en
base_model_relation: finetune
widget:
- example_title: "Robot Manipulation"
  text: "Dual camera robotics control for manipulation tasks"
---

# GR00T Wave: Dual Camera Robotics Foundation Model

## Model Overview

GR00T Wave is a robotics foundation model fine-tuned on dual-camera manipulation demonstrations from the SO101 Wave dataset. By fusing two synchronized camera views, the model gains a richer spatial understanding of the workspace, enabling more reliable manipulation than single-camera input.

## Key Features

- **Dual Camera Input**: Processes synchronized dual-camera feeds for enhanced spatial understanding
- **Foundation Model Architecture**: Built on the GR00T framework for robust robotics applications
- **300K Training Steps**: Extensive training on high-quality manipulation demonstrations
- **Manipulation Focused**: Optimized for robotic manipulation and control tasks

## Model Details

- **Model Type**: GR00T Robotics Foundation Model
- **Training Data**: SO101 Wave 300K Dual Camera Dataset
- **Architecture**: Transformer-based with dual camera encoders
- **Training Steps**: 300,000 steps with checkpoints at 150K and 300K
- **Input Modalities**: Dual RGB cameras, robot state
- **Output**: Robot actions and control commands

## Usage

```python
from transformers import AutoModel

# Load the model (the custom GR00T architecture requires trust_remote_code)
model = AutoModel.from_pretrained("cagataydev/gr00t-wave", trust_remote_code=True)

# Note: this only loads the weights; running the policy on a robot
# requires a specialized robotics inference pipeline.
```

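The exact observation schema depends on the GR00T inference pipeline and is not documented here; the field names below (`video.front`, `video.wrist`, `state.joints`) and the 224×224 frame size are illustrative assumptions, not the model's confirmed API. As a sketch, a single dual-camera step observation could be assembled like this:

```python
import numpy as np

def build_observation(front_frame, wrist_frame, joint_positions):
    """Assemble one step of a dual-camera observation.

    All key names and shapes here are hypothetical placeholders;
    consult the model's modality configuration for the real schema.
    """
    assert front_frame.shape == wrist_frame.shape == (224, 224, 3)
    return {
        "video.front": front_frame[None],       # (1, H, W, C) uint8 frame
        "video.wrist": wrist_frame[None],       # (1, H, W, C) uint8 frame
        "state.joints": joint_positions[None],  # (1, D) robot joint state
    }

# Dummy arrays standing in for real camera frames and robot state
front = np.zeros((224, 224, 3), dtype=np.uint8)
wrist = np.zeros((224, 224, 3), dtype=np.uint8)
joints = np.zeros(6, dtype=np.float32)

obs = build_observation(front, wrist, joints)
print({k: v.shape for k, v in obs.items()})
```

Batching a leading axis onto each array mirrors the common convention of transformer policies that expect a time or batch dimension even for single-step inference.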
## Training Configuration

- **Base Model**: GR00T N1.5-3B
- **Dataset**: SO101 Wave 300K Dual Camera
- **Training Framework**: Custom robotics training pipeline
- **Batch Size**: Optimized for dual camera inputs
- **Optimization**: AdamW with custom learning rate scheduling

## Model Files

The repository contains:

- **SafeTensors Model Files**:
  - `model-00001-of-00002.safetensors` (4.7GB)
  - `model-00002-of-00002.safetensors` (2.4GB)
- **Configuration Files**:
  - `config.json`
  - `model.safetensors.index.json`
- **Training Checkpoints**:
  - `checkpoint-150000/` (16GB)
  - `checkpoint-300000/` (16GB)
- **Training Metadata**:
  - `trainer_state.json`
  - `training_args.bin`

## Evaluation

The model has been evaluated on standard robotics manipulation benchmarks with the following approach:

- **Evaluation Steps**: 150 per checkpoint
- **Trajectory Count**: 5 trajectories per evaluation
- **Data Configuration**: SO100 dual camera setup
- **Metrics**: Success rate, manipulation accuracy, and task completion

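With 5 trajectories per evaluation, the success rate is simply the fraction of successful rollouts. A minimal sanity-check computation (the outcome values below are made up for illustration):

```python
# Hypothetical rollout outcomes for one 5-trajectory evaluation (True = success)
outcomes = [True, True, False, True, True]

success_rate = sum(outcomes) / len(outcomes)
print(f"success rate: {success_rate:.0%}")  # 4/5 successes -> 80%
```

Note that with only 5 trajectories per evaluation, success rates are quantized to 20% increments, so small differences between checkpoints may not be statistically meaningful.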
## Applications

This model is suitable for:

- **Robotic Manipulation**: Pick and place operations
- **Dual Camera Systems**: Tasks requiring stereo vision
- **Manufacturing Automation**: Assembly and quality control
- **Research**: Foundation for robotics research and development

## Technical Specifications

- **Model Size**: ~7.1GB (SafeTensors format)
- **Total Repository Size**: ~40GB (including checkpoints)
- **Inference Requirements**: GPU with sufficient VRAM for transformer inference
- **Framework Compatibility**: Transformers, PyTorch

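The ~7.1GB SafeTensors size is broadly consistent with a ~3B-parameter base model (GR00T N1.5-3B) stored mostly in 16-bit precision; the gap above the raw weight estimate would come from tensors kept at other precisions and metadata. A back-of-envelope check, treating 3B parameters as an approximation:

```python
# Rough weight-only memory estimate for a ~3B-parameter model in 16-bit precision.
# This ignores activations, KV caches, and any tensors stored at other precisions,
# so actual VRAM needs at inference time will be higher.
params = 3e9           # approximate parameter count (GR00T N1.5-3B)
bytes_per_param = 2    # fp16 / bf16

weight_gb = params * bytes_per_param / 1e9
print(f"~{weight_gb:.1f} GB of weights")  # ~6.0 GB, in line with the ~7.1GB shards
```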
## Installation

```bash
# Install required dependencies
pip install transformers torch torchvision
pip install huggingface_hub

# Log in to Hugging Face (required if the model repository is private)
huggingface-cli login
```

## Limitations

- Requires a specialized robotics inference pipeline
- Optimized for specific dual camera configurations
- Performance may vary with different robot platforms
- Requires adequate computational resources for real-time inference

## Model Card

This model card summarizes the capabilities, limitations, and intended use cases of GR00T Wave, a dual-camera robotics foundation model fine-tuned from GR00T N1.5-3B and released with full training checkpoints and metadata.

## Ethical Considerations

This model is designed for robotics research and industrial applications. Users should ensure:

- Safe deployment in robotics systems
- Appropriate safety measures for physical robot control
- Compliance with relevant safety standards
- Responsible use in manufacturing and research environments

## Version History

- **v1.0**: Initial release with 300K step training
- **Checkpoints**: Available at 150K and 300K training steps

## Support

For technical questions and implementation support, please refer to the model documentation and community resources.