---
title: FoundationPose Inference
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
tags:
  - computer-vision
  - 6d-pose
  - object-detection
  - robotics
  - foundationpose
---

# FoundationPose Inference Server

This Hugging Face Space provides 6D object pose estimation using [FoundationPose](https://github.com/NVlabs/FoundationPose) with GPU support via Docker.

## Features

- **6D Pose Estimation**: Detect object position and orientation in 3D space
- **Reference-based Tracking**: Register objects using multiple reference images
- **REST API**: Easy integration with robotics pipelines
- **ZeroGPU**: On-demand GPU allocation for efficient inference

## Usage

### Web Interface

1. **Initialize Tab**: Upload reference images of your object from different angles (16-20 recommended)
2. **Estimate Tab**: Upload a query image to detect the object's 6D pose

### HTTP API

#### Initialize Object

```bash
curl -X POST https://gpue-foundationpose.hf.space/api/initialize \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "reference_images_b64": ["<base64-jpeg>", ...],
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```
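The same request can be built from Python. This is a hedged sketch, not code from this Space: the helper name is hypothetical, and only the endpoint fields (`object_id`, `reference_images_b64`, `camera_intrinsics`) come from the API example above.

```python
import base64
import json

def build_initialize_payload(object_id, reference_jpegs, intrinsics):
    """Build the JSON body for POST /api/initialize.

    reference_jpegs: list of raw JPEG byte strings, one per reference view.
    intrinsics: dict with fx, fy, cx, cy; serialized to a JSON string to
    match the curl example above.
    """
    return {
        "object_id": object_id,
        "reference_images_b64": [
            base64.b64encode(img).decode("ascii") for img in reference_jpegs
        ],
        "camera_intrinsics": json.dumps(intrinsics),
    }

# Dummy byte strings stand in for real JPEG data here.
payload = build_initialize_payload(
    "target_cube",
    [b"\xff\xd8fake-jpeg-1", b"\xff\xd8fake-jpeg-2"],
    {"fx": 500, "fy": 500, "cx": 320, "cy": 240},
)
body = json.dumps(payload)
```

The resulting payload can be sent with any HTTP client, e.g. `requests.post(url, json=payload)`.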

#### Estimate Pose

```bash
curl -X POST https://gpue-foundationpose.hf.space/api/estimate \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "query_image_b64": "<base64-jpeg>",
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```

#### Response Format

```json
{
  "success": true,
  "poses": [
    {
      "object_id": "target_cube",
      "position": {"x": 0.5, "y": 0.3, "z": 0.1},
      "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
      "confidence": 0.95,
      "dimensions": [0.1, 0.1, 0.1]
    }
  ]
}
```
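The `orientation` field is a unit quaternion (w, x, y, z); most robotics stacks want it as a rotation matrix. A minimal stdlib-only sketch of parsing the response above, with a hypothetical conversion helper:

```python
import json

def quat_to_rotation_matrix(q):
    """Convert a unit quaternion {w, x, y, z} to a 3x3 rotation matrix."""
    w, x, y, z = q["w"], q["x"], q["y"], q["z"]
    return [
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ]

# The example response from above, as returned by /api/estimate.
response = json.loads("""
{
  "success": true,
  "poses": [
    {
      "object_id": "target_cube",
      "position": {"x": 0.5, "y": 0.3, "z": 0.1},
      "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
      "confidence": 0.95,
      "dimensions": [0.1, 0.1, 0.1]
    }
  ]
}
""")

pose = response["poses"][0]
R = quat_to_rotation_matrix(pose["orientation"])    # identity quaternion -> identity matrix
t = [pose["position"][k] for k in ("x", "y", "z")]  # translation in meters
```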

## Integration with robot-ml

This Space is designed to work with the [robot-ml](https://github.com/gpuschel/robot-ml) training pipeline:

1. Capture reference images: `make capture-reference`
2. Configure perception in `observations.yaml`:
   ```yaml
   perception:
     enabled: true
     model: foundation_pose
     api_url: https://gpue-foundationpose.hf.space
   ```
3. Run training with perception: `make train`
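How robot-ml consumes the poses is defined in that repo's `observations.yaml`, but the general idea can be sketched: flatten each returned pose into a fixed-length vector the policy can ingest. Everything here is a hypothetical illustration; only the response fields come from the API above.

```python
def pose_to_observation(pose):
    """Flatten one pose dict into [x, y, z, qw, qx, qy, qz, confidence].

    Hypothetical helper: the actual robot-ml observation layout is
    configured in observations.yaml, not defined here.
    """
    p, q = pose["position"], pose["orientation"]
    return [
        p["x"], p["y"], p["z"],
        q["w"], q["x"], q["y"], q["z"],
        pose["confidence"],
    ]

obs = pose_to_observation({
    "object_id": "target_cube",
    "position": {"x": 0.5, "y": 0.3, "z": 0.1},
    "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
    "confidence": 0.95,
    "dimensions": [0.1, 0.1, 0.1],
})
```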

## Setup

### Placeholder Mode (Default)

This Space runs in **placeholder mode** by default: the API works end to end but returns empty pose results. Perfect for testing integrations!

### Enable Real Inference

To enable actual 6D pose estimation:

1. **Create a Hugging Face model repository** to host the weights (recommended)
   - 📖 Quick guide: [HF_MODEL_SETUP.md](HF_MODEL_SETUP.md)
   - 📖 Detailed guide: [UPLOAD_WEIGHTS.md](UPLOAD_WEIGHTS.md)

2. **Set environment variables** in this Space's settings:
   ```
   FOUNDATIONPOSE_MODEL_REPO=YOUR_USERNAME/foundationpose-weights
   USE_HF_WEIGHTS=true
   USE_REAL_MODEL=true
   ```

3. **Restart the Space** - weights will download automatically!

**Why use a model repo?** Faster downloads, version control, sharing across Spaces, and no git-lfs needed!

## Performance

- **Cold Start**: 15-30 seconds (ZeroGPU allocation + model loading)
- **Warm Inference**: 0.5-2 seconds per query
- **Recommended Use**: Batch processing, validation, demos

For real-time training loops (30 Hz), use the local dummy estimator instead.

## Documentation

- 🚀 [README_SETUP.md](README_SETUP.md) - Start here for setup
- ⚡ [QUICKSTART.md](QUICKSTART.md) - API usage & integration examples
- 📦 [HF_MODEL_SETUP.md](HF_MODEL_SETUP.md) - 5-minute model repo setup
- 📖 [UPLOAD_WEIGHTS.md](UPLOAD_WEIGHTS.md) - Detailed weight upload guide
- 🔧 [DEPLOYMENT.md](DEPLOYMENT.md) - Full deployment options
- 📊 [STATUS.md](STATUS.md) - Complete project status

## Citation

```bibtex
@inproceedings{wen2023foundationpose,
  title={FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects},
  author={Wen, Bowen and Yang, Wei and Kautz, Jan and Birchfield, Stan},
  booktitle={CVPR},
  year={2024}
}
```