---
title: SpaceDebris Localizer
emoji: 🛰️
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: true
license: mit
---

# SpaceDebris Localizer

Use **NVIDIA LocateAnything-3B** to locate space debris, satellite fragments, and spacecraft components in orbital imagery.

Orbital debris is a growing threat to satellite operations and crewed spaceflight. This project demonstrates how vision-language grounding models can identify and localize objects in space imagery, from satellite solar panels and antennas to rocket bodies and debris fields. Built as a Hugging Face Spaces application, it provides a natural-language interface: describe what you're looking for, and the model draws bounding boxes around matching objects in the image.

## Why This Matters

There are over 36,000 tracked objects in Earth orbit, and millions of smaller fragments too tiny to track. Traditional detection pipelines require specialized training data and domain-specific models. Vision-language grounding models like LocateAnything-3B offer a different approach: describe the target in natural language and let the model find it. This prototype explores whether general-purpose visual grounding can serve as a rapid-deployment tool for orbital debris awareness, satellite inspection, and space situational awareness workflows.

## Architecture

```
User uploads image + text prompt
         │
         ▼
┌─────────────────────┐
│   Gradio Interface   │
│   (app.py)           │
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│  LocateAnythingWorker│
│  (src/inference.py)  │
│  ┌─────────────────┐│
│  │ nvidia/          ││
│  │ LocateAnything-  ││
│  │ 3B (3B params)   ││
│  └─────────────────┘│
└────────┬────────────┘
         │ raw text with <box> tokens
         ▼
┌─────────────────────┐
│  Output Parser       │
│  (src/parsing.py)    │
│  Regex → BBox list   │
└────────┬────────────┘
         │ structured BBox objects
         ▼
┌─────────────────────┐
│  Visualizer          │
│  (src/visualization) │
│  Draw boxes + labels │
└────────┬────────────┘
         │
         ▼
   Annotated image + JSON metadata
```
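
The parsing stage in the diagram (raw text with `<box>` tokens turned into a list of boxes) might look roughly like the sketch below. The exact token format emitted by LocateAnything-3B is not documented here, so the code assumes a hypothetical `<box>x1,y1,x2,y2</box>` encoding. `BBox`, `BOX_PATTERN`, and `parse_boxes` are illustrative names, not necessarily what `src/parsing.py` contains.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: boxes appear as <box>x1,y1,x2,y2</box> with numeric coordinates.
# The real model output format may differ; adjust the pattern accordingly.
_NUM = r"\s*(-?\d+(?:\.\d+)?)\s*"
BOX_PATTERN = re.compile(rf"<box>{_NUM},{_NUM},{_NUM},{_NUM}</box>")


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box with (x1, y1) as top-left and (x2, y2) as bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def parse_boxes(raw_text: str) -> list[BBox]:
    """Extract every well-formed box from the model's raw text output."""
    boxes = []
    for match in BOX_PATTERN.finditer(raw_text):
        x1, y1, x2, y2 = (float(v) for v in match.groups())
        # Normalize corner order so downstream drawing code can rely on it.
        boxes.append(BBox(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)))
    return boxes
```

Malformed or partial tokens simply fail to match and are skipped, which keeps the visualizer from receiving half-parsed coordinates.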

## Setup

### Prerequisites

- Python 3.10+
- CUDA-capable GPU (recommended) or CPU (slow)
- ~8GB GPU memory for bfloat16 inference
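
As a rough sanity check on the memory figure, the weight footprint can be estimated from the parameter count and the bytes per parameter. The sketch assumes exactly 3 billion parameters, which is an approximation of the "3B" in the model name.

```python
# Back-of-the-envelope weight memory for a ~3B-parameter model.
# ASSUMPTION: exactly 3e9 parameters; the real count may differ slightly.
PARAMS = 3_000_000_000
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_gb(dtype: str, params: int = PARAMS) -> float:
    """Return the weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("bfloat16", "float32"):
    print(f"{dtype}: {weight_gb(dtype):.1f} GB")
# prints:
# bfloat16: 6.0 GB
# float32: 12.0 GB
```

Under this assumption the bfloat16 weights alone take about 6 GB, so the ~8GB figure plausibly leaves headroom for activations and the generation cache.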

### Local Installation

```bash
git clone https://github.com/YOUR_USERNAME/space-debris-localizer.git
cd space-debris-localizer
pip install -e ".[dev]"
```

### Run Locally

```bash
python app.py
```

The app launches at `http://localhost:7860`. The first run downloads the model weights (~6GB).

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `MODEL_ID` | `nvidia/LocateAnything-3B` | Hugging Face model ID |
| `DEVICE` | `cuda` | Device (`cuda` or `cpu`) |
| `DTYPE` | `bfloat16` | Model precision |
| `MAX_NEW_TOKENS` | `8192` | Max generation tokens |
| `GENERATION_MODE` | `hybrid` | `fast`, `slow`, or `hybrid` |
| `PORT` | `7860` | Gradio server port |
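
A minimal sketch of how these variables could be read with the defaults from the table. The `Settings` dataclass and `load_settings` function are hypothetical names used for illustration; the app's actual configuration code may be organized differently.

```python
import os
from dataclasses import dataclass

VALID_MODES = {"fast", "slow", "hybrid"}


@dataclass(frozen=True)
class Settings:
    model_id: str
    device: str
    dtype: str
    max_new_tokens: int
    generation_mode: str
    port: int


def load_settings(env=None) -> Settings:
    """Read configuration from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    mode = env.get("GENERATION_MODE", "hybrid")
    if mode not in VALID_MODES:
        raise ValueError(f"GENERATION_MODE must be one of {sorted(VALID_MODES)}, got {mode!r}")
    return Settings(
        model_id=env.get("MODEL_ID", "nvidia/LocateAnything-3B"),
        device=env.get("DEVICE", "cuda"),
        dtype=env.get("DTYPE", "bfloat16"),
        # Environment values are strings, so numeric settings are converted.
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "8192")),
        generation_mode=mode,
        port=int(env.get("PORT", "7860")),
    )
```

For example, `DEVICE=cpu PORT=8080 python app.py` would then yield `Settings(device="cpu", port=8080, ...)` with the remaining fields at their defaults.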

## Deployment to Hugging Face Spaces

### Automatic Sync via GitHub Actions

1. Create a Hugging Face Space at [huggingface.co/new-space](https://huggingface.co/new-space) (select Gradio SDK)
2. Set these GitHub repository secrets:
   - `HF_TOKEN` — your Hugging Face [access token](https://huggingface.co/settings/tokens)
   - `HF_USERNAME` — your Hugging Face username
   - `HF_SPACE_NAME` — your space name
3. Push to `main`. GitHub Actions will sync the repo to your HF Space automatically.
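
The repository's workflow file is not reproduced here. As one possible approach, a sync step could call a small Python script that uploads the checkout with `huggingface_hub` instead of pushing over git. The function names below are hypothetical; the three environment variables are the secrets listed above.

```python
import os


def space_repo_id(env=None) -> str:
    """Build the "<username>/<space>" repo ID from the configured secrets."""
    env = os.environ if env is None else env
    return f"{env['HF_USERNAME']}/{env['HF_SPACE_NAME']}"


def sync_to_space(folder: str = ".") -> None:
    """Upload the contents of `folder` to the Space as a single commit."""
    # Imported lazily so space_repo_id works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(token=os.environ["HF_TOKEN"])
    api.upload_folder(
        folder_path=folder,
        repo_id=space_repo_id(),
        repo_type="space",
        commit_message="Sync from GitHub",
        ignore_patterns=["__pycache__/*", "*.pyc"],
    )
```

A workflow step would expose the three secrets as environment variables and then call `sync_to_space()`. A missing `HF_TOKEN`, `HF_USERNAME`, or `HF_SPACE_NAME` raises `KeyError` before anything is uploaded.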

### Manual Push

```bash
# Clone your HF Space repo
git clone https://huggingface.co/spaces/YOUR_USERNAME/space-debris-localizer
cd space-debris-localizer
# Copy project files
cp -r /path/to/space-debris-localizer/* .
git add . && git commit -m "deploy" && git push
```

## Example Prompts

- `Locate all the instances that match the following description: space debris.`
- `Locate all the instances that match the following description: solar panel.`
- `Locate a single instance that matches the following description: spacecraft.`
- `Locate all the instances that match the following description: antenna.`
- `Locate all the instances that match the following description: rocket body.`
- `Locate all the instances that match the following description: thermal blanket.`
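
The prompts above follow two fixed templates, one for all instances and one for a single instance. A small helper like the following (a hypothetical `build_prompt`, not part of the documented app) could generate them from a plain target description:

```python
ALL_TEMPLATE = "Locate all the instances that match the following description: {target}."
SINGLE_TEMPLATE = "Locate a single instance that matches the following description: {target}."


def build_prompt(target: str, single: bool = False) -> str:
    """Wrap a target description in one of the two prompt templates."""
    # Drop surrounding whitespace and any trailing period so the template
    # does not end with a doubled ".".
    target = target.strip().rstrip(".")
    if not target:
        raise ValueError("target description must not be empty")
    template = SINGLE_TEMPLATE if single else ALL_TEMPLATE
    return template.format(target=target)


print(build_prompt("solar panel"))
# prints: Locate all the instances that match the following description: solar panel.
```

Passing `single=True` switches to the "a single instance" wording used in the spacecraft example.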

## Known Limitations

- **Domain gap:** The model was trained on general grounding data (COCO, LVIS, RefCOCO, etc.), not specifically on orbital imagery. Performance on space scenes is exploratory.
- **Small debris:** Objects below a few pixels are unlikely to be grounded reliably.
- **Image quality:** Detection depends heavily on image resolution and contrast.
- **No confidence calibration:** The model does not output calibrated confidence scores; displayed confidence is a placeholder.
- **GPU strongly recommended:** CPU inference works but is extremely slow because of the model's 3B parameters.

## Future Work

- Fine-tune on orbital debris datasets (e.g., ESA's DISCOS, ESA Clean Space imagery)
- Integrate with real satellite imagery APIs (e.g., ESA Copernicus, Planet Labs)
- Add temporal tracking across image sequences
- Support video input for debris tracking
- Add point-based localization for centroid estimation
- Deploy with quantized model for faster CPU inference

## Tech Stack

- **Model:** [nvidia/LocateAnything-3B](https://huggingface.co/nvidia/LocateAnything-3B)
- **Framework:** Gradio 5.x, Hugging Face Transformers
- **Language:** Python 3.10+
- **CI/CD:** GitHub Actions
- **Deployment:** Hugging Face Spaces

## License

MIT License. The underlying LocateAnything-3B model is subject to the [NVIDIA License](https://huggingface.co/nvidia/LocateAnything-3B/blob/main/LICENSE) (non-commercial research use).