---
title: FoundationPose Inference
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
tags:
  - computer-vision
  - 6d-pose
  - object-detection
  - robotics
  - foundationpose
---

# FoundationPose Inference Server

This Hugging Face Space provides 6D object pose estimation using FoundationPose with GPU support via Docker.

## Features

- **6D Pose Estimation**: Detect object position and orientation in 3D space
- **Reference-based Tracking**: Register objects using multiple reference images
- **REST API**: Easy integration with robotics pipelines
- **ZeroGPU**: On-demand GPU allocation for efficient inference

## Usage

### Web Interface

1. **Initialize Tab**: Upload reference images of your object from different angles (16-20 recommended)
2. **Estimate Tab**: Upload a query image to detect the object's 6D pose

### HTTP API

#### Initialize Object

```bash
curl -X POST https://gpue-foundationpose.hf.space/api/initialize \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "reference_images_b64": ["<base64-jpeg>", ...],
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```
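The same request can be issued from Python. This is a minimal sketch using only the standard library; the helper names and the placeholder intrinsics values are illustrative, not part of the Space's codebase. Note that `camera_intrinsics` is sent as a JSON *string* embedded in the JSON body, matching the curl example above.

```python
import base64
import json
import urllib.request

API_URL = "https://gpue-foundationpose.hf.space"  # from the examples above


def build_initialize_payload(object_id, image_bytes_list, intrinsics):
    """Assemble the /api/initialize body: each image is base64-encoded JPEG
    bytes, and camera_intrinsics is serialized to a JSON string."""
    return {
        "object_id": object_id,
        "reference_images_b64": [
            base64.b64encode(b).decode("ascii") for b in image_bytes_list
        ],
        "camera_intrinsics": json.dumps(intrinsics),
    }


def post_json(path, payload):
    """POST a JSON payload to the Space and return the decoded response."""
    req = urllib.request.Request(
        API_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example usage (requires reference JPEGs on disk and network access):
# images = [open(p, "rb").read() for p in ("ref_00.jpg", "ref_01.jpg")]
# intrinsics = {"fx": 500, "fy": 500, "cx": 320, "cy": 240}
# result = post_json("/api/initialize",
#                    build_initialize_payload("target_cube", images, intrinsics))
```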

#### Estimate Pose

```bash
curl -X POST https://gpue-foundationpose.hf.space/api/estimate \
  -H "Content-Type: application/json" \
  -d '{
    "object_id": "target_cube",
    "query_image_b64": "<base64-jpeg>",
    "camera_intrinsics": "{\"fx\": 500, \"fy\": 500, \"cx\": 320, \"cy\": 240}"
  }'
```
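An equivalent estimate payload can be assembled in Python (standard library only; the helper name is illustrative):

```python
import base64
import json


def build_estimate_payload(object_id, image_bytes, intrinsics):
    """Mirror the curl example: one base64-encoded JPEG plus the camera
    intrinsics serialized as a JSON string."""
    return {
        "object_id": object_id,
        "query_image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "camera_intrinsics": json.dumps(intrinsics),
    }
```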

### Response Format

```json
{
  "success": true,
  "poses": [
    {
      "object_id": "target_cube",
      "position": {"x": 0.5, "y": 0.3, "z": 0.1},
      "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
      "confidence": 0.95,
      "dimensions": [0.1, 0.1, 0.1]
    }
  ]
}
```
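Downstream robotics code typically wants each pose as a homogeneous transform. A minimal conversion sketch, assuming the quaternion is in (w, x, y, z) order as the field names suggest (the README does not state units for `position`):

```python
import numpy as np


def pose_to_matrix(pose):
    """Convert one entry of the `poses` array into a 4x4 homogeneous
    transform, using the standard unit-quaternion-to-rotation formula."""
    q = pose["orientation"]
    w, x, y, z = q["w"], q["x"], q["y"], q["z"]
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = [pose["position"][k] for k in ("x", "y", "z")]
    return T
```

With the example response above (identity orientation), this yields a transform whose rotation block is the identity and whose translation is `(0.5, 0.3, 0.1)`.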

## Integration with robot-ml

This Space is designed to work with the robot-ml training pipeline:

1. Capture reference images: `make capture-reference`
2. Configure perception in `observations.yaml`:

   ```yaml
   perception:
     enabled: true
     model: foundation_pose
     api_url: https://gpue-foundationpose.hf.space
   ```

3. Run training with perception: `make train`

## Setup

### Placeholder Mode (Default)

This Space runs in placeholder mode by default: the API works but returns empty pose results. Perfect for testing integrations!

### Enable Real Inference

To enable actual 6D pose estimation:

1. Create a Hugging Face model repository to host the weights (recommended)
2. Set environment variables in this Space's settings:

   ```
   FOUNDATIONPOSE_MODEL_REPO=YOUR_USERNAME/foundationpose-weights
   USE_HF_WEIGHTS=true
   USE_REAL_MODEL=true
   ```

3. Restart the Space: weights will download automatically!

**Why use a model repo?** Faster downloads, version control, sharing across Spaces, and no git-lfs needed.

## Performance

- **Cold Start**: 15-30 seconds (ZeroGPU allocation + model loading)
- **Warm Inference**: 0.5-2 seconds per query
- **Recommended Use**: Batch processing, validation, demos

For real-time training loops (30 Hz), use the local dummy estimator instead.
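The local dummy estimator is part of the robot-ml pipeline, not this Space; as an illustration only, a stand-in that returns results in the same shape as `/api/estimate` might look like this (class and method names are hypothetical):

```python
class DummyPoseEstimator:
    """Local stand-in matching this API's response schema, for 30 Hz loops
    where round-trips to the Space are too slow. Hypothetical sketch, not
    the robot-ml implementation."""

    def __init__(self, object_id="target_cube"):
        self.object_id = object_id

    def estimate(self, query_image=None):
        # Return a fixed identity pose in the same shape as /api/estimate,
        # so downstream code can run unchanged without network access.
        return {
            "success": True,
            "poses": [{
                "object_id": self.object_id,
                "position": {"x": 0.0, "y": 0.0, "z": 0.0},
                "orientation": {"w": 1.0, "x": 0.0, "y": 0.0, "z": 0.0},
                "confidence": 0.0,
                "dimensions": [0.0, 0.0, 0.0],
            }],
        }
```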


## Citation

```bibtex
@inproceedings{wen2024foundationpose,
  title={FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects},
  author={Wen, Bowen and Yang, Wei and Kautz, Jan and Birchfield, Stan},
  booktitle={CVPR},
  year={2024}
}
```