---
title: SpaceDebris Localizer
emoji: 🛰️
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: true
license: mit
---

# SpaceDebris Localizer

Use **NVIDIA LocateAnything-3B** to locate space debris, satellite fragments, and spacecraft components in orbital imagery.

Orbital debris is a growing threat to satellite operations and crewed spaceflight. This project demonstrates how state-of-the-art vision-language grounding models can be applied to identify and localize objects in space imagery, from satellite solar panels and antennas to rocket bodies and debris fields. Built as a Hugging Face Spaces application, it provides a natural-language interface: describe what you're looking for, and the model draws bounding boxes around matching objects in the image.

## Why This Matters

There are over 36,000 tracked objects in Earth orbit, and millions of smaller fragments too tiny to track. Traditional detection pipelines require specialized training data and domain-specific models. Vision-language grounding models like LocateAnything-3B offer a different approach: describe the target in natural language and let the model find it. This prototype explores whether general-purpose visual grounding can serve as a rapid-deployment tool for orbital debris awareness, satellite inspection, and space situational awareness workflows.

## Architecture

```
User uploads image + text prompt
         │
         ▼
┌──────────────────────┐
│   Gradio Interface   │
│   (app.py)           │
└────────┬─────────────┘
         │
         ▼
┌──────────────────────┐
│  LocateAnythingWorker│
│  (src/inference.py)  │
│  ┌──────────────────┐│
│  │ nvidia/          ││
│  │ LocateAnything-  ││
│  │ 3B (3B params)   ││
│  └──────────────────┘│
└────────┬─────────────┘
         │ raw text with <box> tokens
         ▼
┌──────────────────────┐
│  Output Parser       │
│  (src/parsing.py)    │
│  Regex → BBox list   │
└────────┬─────────────┘
         │ structured BBox objects
         ▼
┌──────────────────────┐
│  Visualizer          │
│  (src/visualization) │
│  Draw boxes + labels │
└────────┬─────────────┘
         │
         ▼
   Annotated image + JSON metadata
```
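
The parsing stage in the diagram can be sketched roughly as follows. This is an illustration only, not the contents of `src/parsing.py`: the `<box>x1,y1,x2,y2</box>` token layout, the 0-1000 coordinate grid, and the names `BBox`, `BOX_PATTERN`, and `parse_boxes` are all assumptions made for the example. Check the model card for the actual output format before relying on any of it.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: boxes appear as <box>x1,y1,x2,y2</box> with integer coordinates
# on a 0-1000 grid. The real LocateAnything-3B output format may differ.
BOX_PATTERN = re.compile(
    r"<box>\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*</box>"
)
COORD_SCALE = 1000  # hypothetical normalisation range


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in pixel coordinates (hypothetical structure)."""
    x1: int
    y1: int
    x2: int
    y2: int


def parse_boxes(raw_text: str, width: int, height: int) -> list[BBox]:
    """Extract every <box> token from raw_text and scale it to image pixels."""
    boxes = []
    for match in BOX_PATTERN.finditer(raw_text):
        x1, y1, x2, y2 = (int(v) for v in match.groups())
        # Order the corners so x1 <= x2 and y1 <= y2 even if they arrive swapped.
        x1, x2 = sorted((x1, x2))
        y1, y2 = sorted((y1, y2))
        boxes.append(BBox(
            x1=x1 * width // COORD_SCALE,
            y1=y1 * height // COORD_SCALE,
            x2=x2 * width // COORD_SCALE,
            y2=y2 * height // COORD_SCALE,
        ))
    return boxes
```

Returning an empty list when no token matches keeps the downstream visualizer simple: it can draw nothing and still return the original image.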

## Setup

### Prerequisites

- Python 3.10+
- CUDA-capable GPU (recommended) or CPU (slow)
- ~8GB GPU memory for bfloat16 inference

### Local Installation

```bash
git clone https://github.com/YOUR_USERNAME/space-debris-localizer.git
cd space-debris-localizer
pip install -e ".[dev]"
```

### Run Locally

```bash
python app.py
```

The app launches at `http://localhost:7860`. The first run downloads the model weights (~6GB).

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `MODEL_ID` | `nvidia/LocateAnything-3B` | HuggingFace model ID |
| `DEVICE` | `cuda` | Device (`cuda` or `cpu`) |
| `DTYPE` | `bfloat16` | Model precision |
| `MAX_NEW_TOKENS` | `8192` | Max generation tokens |
| `GENERATION_MODE` | `hybrid` | `fast`, `slow`, or `hybrid` |
| `PORT` | `7860` | Gradio server port |
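
One way the table above could be read into a typed settings object is sketched below. The variable names and defaults come from the table; the `Settings` dataclass, the `load_settings` helper, and the validation of `GENERATION_MODE` are hypothetical and may not match how `app.py` actually loads its configuration.

```python
import os
from dataclasses import dataclass

VALID_MODES = ("fast", "slow", "hybrid")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the table."""
    model_id: str
    device: str
    dtype: str
    max_new_tokens: int
    generation_mode: str
    port: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    mode = env.get("GENERATION_MODE", "hybrid")
    if mode not in VALID_MODES:
        raise ValueError(
            f"GENERATION_MODE must be one of {VALID_MODES}, got {mode!r}"
        )
    return Settings(
        model_id=env.get("MODEL_ID", "nvidia/LocateAnything-3B"),
        device=env.get("DEVICE", "cuda"),
        dtype=env.get("DTYPE", "bfloat16"),
        # Environment values are strings, so numeric settings are converted here.
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "8192")),
        generation_mode=mode,
        port=int(env.get("PORT", "7860")),
    )
```

Accepting the mapping as a parameter makes the loader easy to exercise without touching the real process environment, for example `load_settings({"DEVICE": "cpu", "PORT": "8080"})`.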

## Deployment to Hugging Face Spaces

### Automatic Sync via GitHub Actions

1. Create a Hugging Face Space at [huggingface.co/new-space](https://huggingface.co/new-space) (select Gradio SDK)
2. Set these GitHub repository secrets:
   - `HF_TOKEN`: your Hugging Face [access token](https://huggingface.co/settings/tokens)
   - `HF_USERNAME`: your Hugging Face username
   - `HF_SPACE_NAME`: your space name
3. Push to `main`. GitHub Actions will sync the repo to your HF Space automatically.

### Manual Push

```bash
# Clone your HF Space repo
git clone https://huggingface.co/spaces/YOUR_USERNAME/space-debris-localizer
cd space-debris-localizer
# Copy project files
cp -r /path/to/space-debris-localizer/* .
git add . && git commit -m "deploy" && git push
```

## Example Prompts

- `Locate all the instances that match the following description: space debris.`
- `Locate all the instances that match the following description: solar panel.`
- `Locate a single instance that matches the following description: spacecraft.`
- `Locate all the instances that match the following description: antenna.`
- `Locate all the instances that match the following description: rocket body.`
- `Locate all the instances that match the following description: thermal blanket.`
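
The prompts above follow two fixed templates, so they can be generated from a short target description. The sketch below shows one way to do that; `build_prompt` and its `single` flag are hypothetical names for illustration and are not part of the project's code.

```python
# Templates copied from the example prompts above.
ALL_TEMPLATE = (
    "Locate all the instances that match the following description: {target}."
)
SINGLE_TEMPLATE = (
    "Locate a single instance that matches the following description: {target}."
)


def build_prompt(target: str, single: bool = False) -> str:
    """Wrap a short object description in one of the two prompt templates."""
    # Trim whitespace and any trailing period so the template adds exactly one.
    target = target.strip().rstrip(".")
    if not target:
        raise ValueError("target description must not be empty")
    template = SINGLE_TEMPLATE if single else ALL_TEMPLATE
    return template.format(target=target)


print(build_prompt("space debris"))
print(build_prompt("spacecraft", single=True))
```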

## Known Limitations

- **Domain gap:** The model was trained on general grounding data (COCO, LVIS, RefCOCO, etc.), not specifically on orbital imagery. Performance on space scenes is exploratory.
- **Small debris:** Objects below a few pixels are unlikely to be grounded reliably.
- **Image quality:** Detection depends heavily on image resolution and contrast.
- **No confidence calibration:** The model does not output calibrated confidence scores; displayed confidence is a placeholder.
- **GPU required:** CPU inference is extremely slow due to the 3B parameter size.

## Future Work

- Fine-tune on orbital debris datasets (e.g., ESA's DISCOS, ESA Clean Space imagery)
- Integrate with real satellite imagery APIs (e.g., ESA Copernicus, Planet Labs)
- Add temporal tracking across image sequences
- Support video input for debris tracking
- Add point-based localization for centroid estimation
- Deploy with quantized model for faster CPU inference

## Tech Stack

- **Model:** [nvidia/LocateAnything-3B](https://huggingface.co/nvidia/LocateAnything-3B)
- **Framework:** Gradio 5.x, Hugging Face Transformers
- **Language:** Python 3.10+
- **CI/CD:** GitHub Actions
- **Deployment:** Hugging Face Spaces

## License

MIT License. The underlying LocateAnything-3B model is subject to the [NVIDIA License](https://huggingface.co/nvidia/LocateAnything-3B/blob/main/LICENSE) (non-commercial research use).