---
title: FoundationPose Inference
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
tags:
  - computer-vision
  - 6d-pose
  - object-detection
  - robotics
  - foundationpose
---

# FoundationPose Inference Server

This Hugging Face Space provides 6D object pose estimation using [FoundationPose](https://github.com/NVlabs/FoundationPose) with GPU support via Docker.

## Features

- **6D Pose Estimation**: Detect object position and orientation in 3D space
- **Reference-based Tracking**: Register objects using multiple reference images
- **REST API**: Easy integration with robotics pipelines
- **ZeroGPU**: On-demand GPU allocation for efficient inference

## Usage

### Web Interface

1. **Initialize Tab**: Upload reference images of your object from different angles (16-20 recommended)
2. **Estimate Tab**: Upload a query image to detect the object's 6D pose

### HTTP API

#### Initialize Object

```bash
curl -X POST https://gpue-foundationpose.hf.space/api/initialize \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "reference_images_b64": ["<base64-encoded image>", ...],
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```

#### Estimate Pose

```bash
curl -X POST https://gpue-foundationpose.hf.space/api/estimate \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "query_image_b64": "<base64-encoded image>",
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```

#### Response Format

```json
{
  "success": true,
  "poses": [
    {
      "object_id": "target_cube",
      "position": {"x": 0.5, "y": 0.3, "z": 0.1},
      "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
      "confidence": 0.95,
      "dimensions": [0.1, 0.1, 0.1]
    }
  ]
}
```

## Integration with robot-ml

This Space is designed to work with the [robot-ml](https://github.com/gpuschel/robot-ml) training pipeline:

1. Capture reference images: `make capture-reference`
2. Configure perception in `observations.yaml`:

   ```yaml
   perception:
     enabled: true
     model: foundation_pose
     api_url: https://gpue-foundationpose.hf.space
   ```

3. Run training with perception: `make train`

## Setup

### Placeholder Mode (Default)

This Space runs in **placeholder mode** by default: the API accepts requests but returns empty pose results. This makes it useful for testing integrations without GPU cost.

### Enable Real Inference

To enable actual 6D pose estimation:

1. **Create a Hugging Face model repository** to host the weights (recommended)
   - 📖 Quick guide: [HF_MODEL_SETUP.md](HF_MODEL_SETUP.md)
   - 📖 Detailed guide: [UPLOAD_WEIGHTS.md](UPLOAD_WEIGHTS.md)
2. **Set environment variables** in this Space's settings:

   ```
   FOUNDATIONPOSE_MODEL_REPO=YOUR_USERNAME/foundationpose-weights
   USE_HF_WEIGHTS=true
   USE_REAL_MODEL=true
   ```

3. **Restart the Space** - the weights will download automatically.

**Why use a model repo?** Faster downloads, version control, sharing across Spaces, and no git-lfs needed.

## Performance

- **Cold Start**: 15-30 seconds (ZeroGPU allocation + model loading)
- **Warm Inference**: 0.5-2 seconds per query
- **Recommended Use**: Batch processing, validation, demos

For real-time training loops (30 Hz), use the local dummy estimator instead.

## Documentation

- 🚀 [README_SETUP.md](README_SETUP.md) - Start here for setup
- ⚡ [QUICKSTART.md](QUICKSTART.md) - API usage & integration examples
- 📦 [HF_MODEL_SETUP.md](HF_MODEL_SETUP.md) - 5-minute model repo setup
- 📖 [UPLOAD_WEIGHTS.md](UPLOAD_WEIGHTS.md) - Detailed weight upload guide
- 🔧 [DEPLOYMENT.md](DEPLOYMENT.md) - Full deployment options
- 📊 [STATUS.md](STATUS.md) - Complete project status

## Citation

```bibtex
@inproceedings{wen2023foundationpose,
  title={FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects},
  author={Wen, Bowen and Yang, Wei and Kautz, Jan and Birchfield, Stan},
  booktitle={CVPR},
  year={2024}
}
```
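
## Example: Python Client

The curl calls in the HTTP API section can be wrapped in a small Python client. The sketch below is a minimal example, not part of the Space's official tooling: it assumes only the base URL, endpoints, and field names shown in the curl examples, and `quat_to_matrix` is a hypothetical helper (standard unit-quaternion math) for turning the returned `orientation` into a rotation matrix.

```python
import base64
import json
import urllib.request

# Base URL and intrinsics taken from the curl examples above.
BASE_URL = "https://gpue-foundationpose.hf.space"
INTRINSICS = json.dumps({"fx": 500, "fy": 500, "cx": 320, "cy": 240})


def b64_image(path):
    """Read an image file and return its base64 string for the *_b64 fields."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def post_json(endpoint, payload):
    """POST a JSON payload to the Space and decode the JSON response."""
    req = urllib.request.Request(
        BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def initialize(object_id, reference_paths):
    """Register an object from reference images (/api/initialize)."""
    return post_json("/api/initialize", {
        "object_id": object_id,
        "reference_images_b64": [b64_image(p) for p in reference_paths],
        "camera_intrinsics": INTRINSICS,
    })


def estimate(object_id, query_path):
    """Query the 6D pose of a registered object (/api/estimate)."""
    return post_json("/api/estimate", {
        "object_id": object_id,
        "query_image_b64": b64_image(query_path),
        "camera_intrinsics": INTRINSICS,
    })


def quat_to_matrix(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) into a 3x3 rotation matrix."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


# Example usage (requires network access to the Space; file names are placeholders):
#   initialize("target_cube", ["ref_00.png", "ref_01.png"])
#   result = estimate("target_cube", "query.png")
#   for pose in result.get("poses", []):
#       q = pose["orientation"]
#       R = quat_to_matrix(q["w"], q["x"], q["y"], q["z"])
```

In placeholder mode `poses` comes back empty, so the loop above does nothing, which makes this script a convenient integration smoke test before real weights are enabled.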