---
title: FoundationPose Inference
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
tags:
- computer-vision
- 6d-pose
- object-detection
- robotics
- foundationpose
---
# FoundationPose Inference Server
This Hugging Face Space provides 6D object pose estimation using [FoundationPose](https://github.com/NVlabs/FoundationPose) with GPU support via Docker.
## Features
- **6D Pose Estimation**: Detect object position and orientation in 3D space
- **Reference-based Tracking**: Register objects using multiple reference images
- **REST API**: Easy integration with robotics pipelines
- **ZeroGPU**: On-demand GPU allocation for efficient inference
## Usage
### Web Interface
1. **Initialize Tab**: Upload reference images of your object from different angles (16-20 recommended)
2. **Estimate Tab**: Upload a query image to detect the object's 6D pose
### HTTP API
#### Initialize Object
```bash
curl -X POST https://gpue-foundationpose.hf.space/api/initialize \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "reference_images_b64": ["<base64-jpeg>", ...],
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```
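The same request body can be assembled in Python. A minimal sketch, assuming the payload shape shown above; the function name and intrinsics defaults are illustrative, not part of this repo. Note that `camera_intrinsics` is passed as a JSON *string*, not a nested object, matching the example payload:

```python
import base64
import json

def build_initialize_payload(object_id, images, fx=500, fy=500, cx=320, cy=240):
    """Assemble the JSON body for /api/initialize.

    `images` is a list of raw JPEG bytes (e.g. read from disk with open(path, "rb")).
    """
    refs = [base64.b64encode(img).decode("ascii") for img in images]
    return {
        "object_id": object_id,
        "reference_images_b64": refs,
        # The API example passes intrinsics as a JSON-encoded string, not an object.
        "camera_intrinsics": json.dumps({"fx": fx, "fy": fy, "cx": cx, "cy": cy}),
    }
```

The resulting dict can be sent with any HTTP client (e.g. `requests.post(url, json=payload)`).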
#### Estimate Pose
```bash
curl -X POST https://gpue-foundationpose.hf.space/api/estimate \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "query_image_b64": "<base64-jpeg>",
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```
#### Response Format
```json
{
  "success": true,
  "poses": [
    {
      "object_id": "target_cube",
      "position": {"x": 0.5, "y": 0.3, "z": 0.1},
      "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
      "confidence": 0.95,
      "dimensions": [0.1, 0.1, 0.1]
    }
  ]
}
```
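For use in a robotics pipeline, the quaternion orientation usually needs converting to a rotation matrix. A minimal parsing sketch, assuming only the response schema shown above (the function names and confidence threshold are illustrative):

```python
import json

def quat_to_matrix(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    return [
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ]

def parse_poses(response_text, min_confidence=0.5):
    """Extract (object_id, translation, rotation-matrix) tuples from /api/estimate output."""
    body = json.loads(response_text)
    if not body.get("success"):
        return []
    results = []
    for pose in body.get("poses", []):
        if pose["confidence"] < min_confidence:
            continue
        p, q = pose["position"], pose["orientation"]
        results.append((
            pose["object_id"],
            (p["x"], p["y"], p["z"]),
            quat_to_matrix(q["w"], q["x"], q["y"], q["z"]),
        ))
    return results
```

The identity quaternion `{"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0}` in the sample response maps to the identity rotation.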
## Integration with robot-ml
This Space is designed to work with the [robot-ml](https://github.com/gpuschel/robot-ml) training pipeline:
1. Capture reference images: `make capture-reference`
2. Configure perception in `observations.yaml`:
```yaml
perception:
  enabled: true
  model: foundation_pose
  api_url: https://gpue-foundationpose.hf.space
```
3. Run training with perception: `make train`
## Setup
### Placeholder Mode (Default)
This Space runs in **placeholder mode** by default: the API responds normally but returns empty pose results, which makes it useful for testing integrations end to end before any weights are uploaded.
### Enable Real Inference
To enable actual 6D pose estimation:
1. **Create a Hugging Face model repository** to host the weights (recommended)
- 📖 Quick guide: [HF_MODEL_SETUP.md](HF_MODEL_SETUP.md)
- 📖 Detailed guide: [UPLOAD_WEIGHTS.md](UPLOAD_WEIGHTS.md)
2. **Set environment variables** in this Space's settings:
```
FOUNDATIONPOSE_MODEL_REPO=YOUR_USERNAME/foundationpose-weights
USE_HF_WEIGHTS=true
USE_REAL_MODEL=true
```
3. **Restart the Space** - weights will download automatically!
**Why use a model repo?** Faster downloads, version control, reuse across Spaces, and no git-lfs needed in this repository.
## Performance
- **Cold Start**: 15-30 seconds (ZeroGPU allocation + model loading)
- **Warm Inference**: 0.5-2 seconds per query
- **Recommended Use**: Batch processing, validation, demos
For real-time training loops (30 Hz), use the local dummy estimator instead.
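The actual dummy estimator lives in the robot-ml repository; a hypothetical stand-in that mimics this Space's response schema with zero network latency might look like:

```python
class DummyPoseEstimator:
    """Hypothetical local stand-in for the remote API: returns a fixed identity
    pose in the same shape as the /api/estimate response, suitable for 30 Hz loops."""

    def __init__(self, object_id="target_cube"):
        self.object_id = object_id

    def estimate(self, query_image_b64=None):
        # The query image is ignored; a constant pose is returned immediately.
        return {
            "success": True,
            "poses": [{
                "object_id": self.object_id,
                "position": {"x": 0.0, "y": 0.0, "z": 0.0},
                "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
                "confidence": 1.0,
                "dimensions": [0.1, 0.1, 0.1],
            }],
        }
```

Because the response shape matches the HTTP API, downstream code can swap between the remote Space and the local stand-in without changes.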
## Documentation
- 🚀 [README_SETUP.md](README_SETUP.md) - Start here for setup
- ⚡ [QUICKSTART.md](QUICKSTART.md) - API usage & integration examples
- 📦 [HF_MODEL_SETUP.md](HF_MODEL_SETUP.md) - 5-minute model repo setup
- 📖 [UPLOAD_WEIGHTS.md](UPLOAD_WEIGHTS.md) - Detailed weight upload guide
- 🔧 [DEPLOYMENT.md](DEPLOYMENT.md) - Full deployment options
- 📊 [STATUS.md](STATUS.md) - Complete project status
## Citation
```bibtex
@inproceedings{wen2023foundationpose,
  title={FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects},
  author={Wen, Bowen and Yang, Wei and Kautz, Jan and Birchfield, Stan},
  booktitle={CVPR},
  year={2024}
}
```