---
title: SpaceDebris Localizer
emoji: 🛰️
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: true
license: mit
---
# SpaceDebris Localizer
Use **NVIDIA LocateAnything-3B** to locate space debris, satellite fragments, and spacecraft components in orbital imagery.
Orbital debris is a growing threat to satellite operations and crewed spaceflight. This project demonstrates how state-of-the-art vision-language grounding models can be applied to identify and localize objects in space imagery, from satellite solar panels and antennas to rocket bodies and debris fields. Built as a Hugging Face Spaces application, it provides a natural-language interface: describe what you're looking for, and the model draws bounding boxes around matching objects in the image.
## Why This Matters
There are over 36,000 tracked objects in Earth orbit, and millions of smaller fragments too tiny to track. Traditional detection pipelines require specialized training data and domain-specific models. Vision-language grounding models like LocateAnything-3B offer a different approach: describe the target in natural language and let the model find it. This prototype explores whether general-purpose visual grounding can serve as a rapid-deployment tool for orbital debris awareness, satellite inspection, and space situational awareness workflows.
## Architecture
```
User uploads image + text prompt
           │
           ▼
┌──────────────────────┐
│   Gradio Interface   │
│       (app.py)       │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ LocateAnythingWorker │
│  (src/inference.py)  │
│ ┌──────────────────┐ │
│ │ nvidia/          │ │
│ │ LocateAnything-  │ │
│ │ 3B (3B params)   │ │
│ └──────────────────┘ │
└──────────┬───────────┘
           │ raw text with <box> tokens
           ▼
┌──────────────────────┐
│    Output Parser     │
│   (src/parsing.py)   │
│  Regex → BBox list   │
└──────────┬───────────┘
           │ structured BBox objects
           ▼
┌──────────────────────┐
│      Visualizer      │
│ (src/visualization)  │
│ Draw boxes + labels  │
└──────────┬───────────┘
           │
           ▼
Annotated image + JSON metadata
```
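The Output Parser stage in the diagram turns raw model text into structured boxes with a regular expression. The sketch below illustrates that idea only. The exact token layout the model emits is not documented here, so the `<box>x1,y1,x2,y2</box>` format, the `BBox` fields, and the `parse_boxes` name are all assumptions rather than the contents of `src/parsing.py`.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: each box appears as "<box>x1,y1,x2,y2</box>" with integer
# coordinates. The real token format may differ; adjust the pattern to match.
BOX_PATTERN = re.compile(
    r"<box>\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*</box>"
)


@dataclass(frozen=True)
class BBox:
    """Hypothetical box type: top-left (x1, y1) and bottom-right (x2, y2)."""
    x1: int
    y1: int
    x2: int
    y2: int


def parse_boxes(raw_text: str) -> list[BBox]:
    """Extract every well-formed box from the raw text, skipping anything else."""
    boxes = []
    for match in BOX_PATTERN.finditer(raw_text):
        x1, y1, x2, y2 = (int(group) for group in match.groups())
        # Normalize corner order so that x1 <= x2 and y1 <= y2.
        boxes.append(BBox(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)))
    return boxes
```

Malformed or truncated tokens simply fail to match, so the parser returns an empty list instead of raising when the model produces no usable boxes.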
## Setup
### Prerequisites
- Python 3.10+
- CUDA-capable GPU (recommended) or CPU (slow)
- ~8GB GPU memory for bfloat16 inference
### Local Installation
```bash
git clone https://github.com/YOUR_USERNAME/space-debris-localizer.git
cd space-debris-localizer
pip install -e ".[dev]"
```
### Run Locally
```bash
python app.py
```
The app launches at `http://localhost:7860`. The first run downloads the model (~6GB).
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MODEL_ID` | `nvidia/LocateAnything-3B` | Hugging Face model ID |
| `DEVICE` | `cuda` | Device (`cuda` or `cpu`) |
| `DTYPE` | `bfloat16` | Model precision |
| `MAX_NEW_TOKENS` | `8192` | Max generation tokens |
| `GENERATION_MODE` | `hybrid` | `fast`, `slow`, or `hybrid` |
| `PORT` | `7860` | Gradio server port |
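
As a rough illustration of how these variables could be consumed, the sketch below reads them from an environment mapping and falls back to the defaults in the table. The `Settings` class and `load_settings` function are hypothetical names, not necessarily what `app.py` uses.

```python
import os
from dataclasses import dataclass

VALID_MODES = {"fast", "slow", "hybrid"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the table above."""
    model_id: str
    device: str
    dtype: str
    max_new_tokens: int
    generation_mode: str
    port: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    mode = env.get("GENERATION_MODE", "hybrid")
    if mode not in VALID_MODES:
        raise ValueError(
            f"GENERATION_MODE must be one of {sorted(VALID_MODES)}, got {mode!r}"
        )
    return Settings(
        model_id=env.get("MODEL_ID", "nvidia/LocateAnything-3B"),
        device=env.get("DEVICE", "cuda"),
        dtype=env.get("DTYPE", "bfloat16"),
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "8192")),
        generation_mode=mode,
        port=int(env.get("PORT", "7860")),
    )
```

Passing the mapping explicitly, for example `load_settings({"DEVICE": "cpu"})`, keeps the function easy to test without touching the real environment.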
## Deployment to Hugging Face Spaces
### Automatic Sync via GitHub Actions
1. Create a Hugging Face Space at [huggingface.co/new-space](https://huggingface.co/new-space) (select Gradio SDK)
2. Set these GitHub repository secrets:
   - `HF_TOKEN`: your Hugging Face [access token](https://huggingface.co/settings/tokens)
   - `HF_USERNAME`: your Hugging Face username
   - `HF_SPACE_NAME`: your Space name
3. Push to `main`. GitHub Actions will sync the repo to your HF Space automatically.
### Manual Push
```bash
# Clone your HF Space repo
git clone https://huggingface.co/spaces/YOUR_USERNAME/space-debris-localizer
cd space-debris-localizer
# Copy project files
cp -r /path/to/space-debris-localizer/* .
git add . && git commit -m "deploy" && git push
```
## Example Prompts
- `Locate all the instances that match the following description: space debris.`
- `Locate all the instances that match the following description: solar panel.`
- `Locate a single instance that matches the following description: spacecraft.`
- `Locate all the instances that match the following description: antenna.`
- `Locate all the instances that match the following description: rocket body.`
- `Locate all the instances that match the following description: thermal blanket.`
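
The prompts above follow two fixed templates, one for all matching instances and one for a single instance. A small helper can build them from a bare description. This is a sketch: `build_prompt` is a hypothetical name, and only the two template strings are taken from the examples.

```python
ALL_TEMPLATE = (
    "Locate all the instances that match the following description: {description}."
)
SINGLE_TEMPLATE = (
    "Locate a single instance that matches the following description: {description}."
)


def build_prompt(description: str, single: bool = False) -> str:
    """Wrap a short object description in one of the two prompt templates."""
    # Drop surrounding whitespace and any trailing period so the template
    # does not end up with a doubled full stop.
    cleaned = description.strip().rstrip(".")
    if not cleaned:
        raise ValueError("description must not be empty")
    template = SINGLE_TEMPLATE if single else ALL_TEMPLATE
    return template.format(description=cleaned)
```

For example, `build_prompt("rocket body")` reproduces the fifth prompt in the list, and `build_prompt("spacecraft", single=True)` reproduces the third.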
## Known Limitations
- **Domain gap:** The model was trained on general grounding data (COCO, LVIS, RefCOCO, etc.), not specifically on orbital imagery. Performance on space scenes is exploratory.
- **Small debris:** Objects below a few pixels are unlikely to be grounded reliably.
- **Image quality:** Detection depends heavily on image resolution and contrast.
- **No confidence calibration:** The model does not output calibrated confidence scores; displayed confidence is a placeholder.
- **GPU strongly recommended:** CPU inference works but is extremely slow because of the model's 3B parameters.
## Future Work
- Fine-tune on orbital debris datasets (e.g., ESA's DISCOS, ESA Clean Space imagery)
- Integrate with real satellite imagery APIs (e.g., ESA Copernicus, Planet Labs)
- Add temporal tracking across image sequences
- Support video input for debris tracking
- Add point-based localization for centroid estimation
- Deploy a quantized model for faster CPU inference
## Tech Stack
- **Model:** [nvidia/LocateAnything-3B](https://huggingface.co/nvidia/LocateAnything-3B)
- **Framework:** Gradio 5.x, Hugging Face Transformers
- **Language:** Python 3.10+
- **CI/CD:** GitHub Actions
- **Deployment:** Hugging Face Spaces
## License
MIT License. The underlying LocateAnything-3B model is subject to the [NVIDIA License](https://huggingface.co/nvidia/LocateAnything-3B/blob/main/LICENSE) (non-commercial research use).