---
emoji: πŸ€–
license: mit
model-cards:
- asgard-robot/groot-potato-inference
- asgard-robot/groot-condiment-handover
datasets:
- asgard-robot/asgard_training_data_potato
- asgard-robot/asgard_training_data_condiment
pipeline_tag: robotics
---

# ASGARD Robot πŸ€–

**Creating Intelligent Home Assistant Robots for Human-Robot Interaction**

ASGARD (Autonomous Service Generation for Advanced Robot Deployment) is a research and development initiative focused on creating practical home assistant robots that can safely interact with humans in domestic environments.

---

## 🎯 Mission

To develop autonomous robots that can:
- **Handle everyday household tasks** safely and reliably
- **Interact naturally** with humans in home environments
- **Hand over objects** to humans with proper coordination and social awareness
- **Adapt** to diverse home environments and situations

---

## 🏠 Focus Areas

### 1. Home Environment Manipulation
Our robots are designed to handle common household objects:
- Food items (potatoes, condiments, containers)
- Daily-use objects (cups, utensils, small tools)
- Delicate items requiring careful handling

### 2. Human-Robot Handover
Developing sophisticated coordination for:
- **Gesture Recognition**: Understanding when and how humans want to receive objects
- **Force Feedback**: Proper force control during handover to prevent accidents
- **Timing Coordination**: Synchronizing robot and human movements
- **Social Awareness**: Reading human intent and nonverbal cues

### 3. Multi-Modal Understanding
Our robots integrate:
- **Vision**: Dual camera systems (wrist + external) for comprehensive scene understanding
- **Touch**: Force/torque feedback for delicate manipulation
- **Language**: Natural language understanding for task specification
- **Context**: Awareness of household context and social norms

---

## πŸ“Š Current Models

### Trained GR00T Models

#### 1. Potato Manipulation Model
- **Model:** [groot-potato-inference](https://huggingface.co/asgard-robot/groot-potato-inference)
- **Task:** Potato handling and cleaning in kitchen environments
- **Checkpoint:** Step 2000
- **Base Model:** NVIDIA GR00T N1.5-3B
- **Robot:** ASGARD so101_follower (single-arm, 6 DOF)
- **Performance:** 99.53% reduction in training loss from initialization
- **Dataset:** 40 episodes, 30,795 frames

#### 2. Condiment Handover Model
- **Model:** [groot-condiment-handover](https://huggingface.co/asgard-robot/groot-condiment-handover)
- **Task:** Condiment bottle handling and handover to humans
- **Checkpoint:** Step 2000
- **Base Model:** NVIDIA GR00T N1.5-3B
- **Robot:** ASGARD so101_follower (single-arm, 6 DOF)
- **Dataset:** 40 episodes, 31,522 frames
- **Focus:** Human-robot coordination for object handover

---

+ ## πŸ—‚οΈ Datasets
80
+
81
+ ### Training Datasets
82
+
83
+ #### 1. Potato Training Data
84
+ - **Dataset:** [asgard_training_data_potato](https://huggingface.co/datasets/asgard-robot/asgard_training_data_potato)
85
+ - **Type:** LeRobot v3.0 format
86
+ - **Episodes:** 40 demonstrations
87
+ - **Frames:** 30,795 (avg 770 per episode)
88
+ - **Duration:** ~26 seconds per episode at 30 FPS
89
+ - **Modalities:**
90
+ - Dual RGB cameras (wrist + realsense)
91
+ - 6 DOF joint positions
92
+ - Force feedback
93
+ - **Task:** Potato manipulation and cleaning
94
+
95
+ #### 2. Condiment Training Data
96
+ - **Dataset:** [asgard_training_data_condiment](https://huggingface.co/datasets/asgard-robot/asgard_training_data_condiment)
97
+ - **Type:** LeRobot v3.0 format
98
+ - **Episodes:** 40 demonstrations
99
+ - **Frames:** 31,522 (avg 788 per episode)
100
+ - **Duration:** ~26 seconds per episode at 30 FPS
101
+ - **Modalities:**
102
+ - Dual RGB cameras (wrist + realsense)
103
+ - 6 DOF joint positions
104
+ - Force feedback
105
+ - **Task:** Condiment handling and human handover
106
+
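The per-episode averages and durations quoted above follow directly from the episode and frame counts. A quick sanity check in plain Python, using only figures from this card:

```python
# Sanity-check the dataset statistics quoted on this card.
FPS = 30  # recording rate stated above

datasets = {
    # name: (episodes, total_frames)
    "potato": (40, 30_795),
    "condiment": (40, 31_522),
}

for name, (episodes, frames) in datasets.items():
    avg_frames = frames / episodes   # frames per episode
    avg_seconds = avg_frames / FPS   # episode duration at 30 FPS
    print(f"{name}: {avg_frames:.0f} frames/episode, ~{avg_seconds:.0f} s/episode")
    # potato: 770 frames/episode, ~26 s/episode
    # condiment: 788 frames/episode, ~26 s/episode
```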
---

## πŸ€– Robot Platform

### ASGARD so101_follower
- **Type:** Single-arm manipulator
- **Degrees of Freedom:** 6 (shoulder_pan, shoulder_lift, elbow_flex, wrist_flex, wrist_roll, gripper)
- **Sensors:**
  - Wrist-mounted RGB camera (640Γ—480)
  - External RGB camera (640Γ—480)
  - Force/torque sensors
  - Joint position encoders
- **Capabilities:**
  - Precise object manipulation
  - Force-controlled grasping
  - Human-safe operation
  - Real-time perception

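As a minimal illustration of the six-DOF state above, a container using the joint names listed on this card. The class itself is hypothetical, not part of any ASGARD or LeRobot API:

```python
# Illustrative only: a state container for the so101_follower's six DOF.
# Field names mirror the joint list above; the class is hypothetical.
from dataclasses import dataclass, fields

@dataclass
class So101JointState:
    shoulder_pan: float
    shoulder_lift: float
    elbow_flex: float
    wrist_flex: float
    wrist_roll: float
    gripper: float

    def as_vector(self) -> list[float]:
        """Joint positions in a fixed order, e.g. as a policy input."""
        return [getattr(self, f.name) for f in fields(self)]

home = So101JointState(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
assert len(home.as_vector()) == 6  # one value per degree of freedom
```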
---

## 🧠 Technology Stack

### Base Models
- **NVIDIA GR00T N1.5-3B**: Foundation model for robot manipulation
  - Generalist robot foundation model
  - Trained on diverse manipulation tasks
  - Multi-modal understanding (vision + language + actions)
  - Flow matching for continuous action generation

### Training Framework
- **LeRobot**: PyTorch-based robotics framework
  - ASGARD teleop control branch
  - GR00T policy support
  - Dataset format v3.0
  - Multi-GPU training with Hugging Face Accelerate

### Hardware
- **Training:** 4Γ— NVIDIA H100 PCIe GPUs (80 GB VRAM each)
- **Inference:** Optimized for edge deployment
- **Compute:** 320 GB total VRAM for full fine-tuning

---

## πŸ”¬ Research Goals

### Short-Term
1. **Robust Manipulation**: Reliable handling of diverse household objects
2. **Safe Handover**: Zero accidents in human-robot handover scenarios
3. **Context Awareness**: Understanding household context and social norms
4. **Adaptation**: Quick adaptation to new objects and scenarios

### Long-Term
1. **General Household Assistance**: Cooking, cleaning, organization
2. **Human-Robot Collaboration**: Seamless teamwork with humans
3. **Learning from Demonstration**: Improved generalization from limited data
4. **Real-Time Adaptation**: Dynamic adjustment to unexpected situations

---

## πŸ—οΈ Architecture

### Model Architecture
Our models are fine-tuned from GR00T N1.5-3B:
- **Frozen Components:**
  - Vision encoder (preserves visual understanding)
  - LLM (maintains language understanding)
- **Trainable Components:**
  - Diffusion transformer (action generation)
  - Projector (vision-language β†’ action mapping)

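A rough sketch of the freeze/train split described above. The component names here are illustrative, not the actual GR00T module names; in a PyTorch setting the same filter would decide which parameters get `requires_grad = False`:

```python
# Hypothetical sketch of the frozen/trainable split; component names
# are illustrative, not the real GR00T N1.5-3B module names.
FROZEN = {"vision_encoder", "llm"}
TRAINABLE = {"diffusion_transformer", "projector"}

def is_trainable(param_name: str) -> bool:
    """A parameter is updated iff its top-level component is trainable."""
    return param_name.split(".", 1)[0] in TRAINABLE

params = [
    "vision_encoder.patch_embed.weight",     # frozen
    "llm.layers.0.attn.weight",              # frozen
    "diffusion_transformer.block_0.weight",  # trained
    "projector.weight",                      # trained
]
trainable = [p for p in params if is_trainable(p)]
assert trainable == ["diffusion_transformer.block_0.weight", "projector.weight"]
```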
### Training Strategy
- **Full Fine-Tuning**: All trainable parameters updated
- **Batch Size:** 512 global (128 per GPU Γ— 4 GPUs)
- **Training Steps:** 2,000 per task
- **Approx. Epochs:** ~33 (potato) / ~32 (condiment)
- **Learning Rate:** 1e-4 with warmup
- **Precision:** bf16 mixed precision

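The "~33 / ~32 epochs" figures follow from the step count, global batch size, and dataset sizes quoted on this card:

```python
# Effective epochs = (training steps * global batch size) / dataset frames.
# All figures are taken from this card.
STEPS = 2_000
GLOBAL_BATCH = 128 * 4  # 128 per GPU * 4 GPUs = 512

for name, frames in [("potato", 30_795), ("condiment", 31_522)]:
    epochs = STEPS * GLOBAL_BATCH / frames
    print(f"{name}: ~{epochs:.0f} epochs")  # potato ~33, condiment ~32
```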
---

## πŸ“ˆ Performance

### Training Results
Both models converge cleanly:
- **Loss Reduction:** 99%+ from initial to final
- **Stability:** No overfitting observed
- **Convergence:** Reached between steps 1,200 and 1,600
- **Final Loss:** ~0.006 (from initial ~1.2)

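The "99%+" figure is consistent with the approximate initial and final losses quoted above:

```python
# Relative loss reduction from the approximate values on this card.
initial_loss = 1.2
final_loss = 0.006
reduction = (initial_loss - final_loss) / initial_loss * 100
print(f"~{reduction:.1f}% reduction")  # ~99.5%, matching the "99%+" claim
```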
### Metrics
- **Training Time:** ~2 hours per model
- **Memory Usage:** 60-70 GB per GPU
- **Throughput:** 2-3 samples/second per GPU
- **Checkpoints:** 5 per training run (steps 400, 800, 1200, 1600, 2000)

---

## 🀝 Contributing

We welcome contributions in:
- Additional household task datasets
- Improved handover algorithms
- Multi-robot coordination
- Human behavior modeling
- Safety protocols

---

## πŸ“š Citations

If you use our models or datasets, please cite:

```bibtex
@misc{asgard_robot_2024,
  title  = {ASGARD Robot: Home Assistant Robot for Human-Robot Interaction},
  author = {{ASGARD Team}},
  year   = {2024},
  url    = {https://huggingface.co/asgard-robot},
  note   = {Models: groot-potato-inference, groot-condiment-handover.
            Datasets: asgard_training_data_potato, asgard_training_data_condiment}
}
```

---

## πŸ“ž Contact

- **Organization:** [asgard-robot](https://huggingface.co/asgard-robot)
- **Models:** https://huggingface.co/asgard-robot
- **Datasets:** https://huggingface.co/asgard-robot

---

## πŸŽ–οΈ Acknowledgments

- **Base Model:** NVIDIA GR00T N1.5-3B
- **Framework:** LeRobot (Hugging Face)
- **Hardware:** Shadeform H100 multi-GPU cluster
- **Research:** ASGARD Team

---

## 🌟 Vision

We envision a future where robots integrate seamlessly into home environments, assisting humans with daily tasks while maintaining the highest standards of safety, reliability, and social awareness. Our work focuses on practical applications that improve quality of life and enable independent living.

---

**Building the future of home robotics, one handover at a time.** πŸ€–β€οΈ