| title: SpaceDebris Localizer | |
| emoji: 🛰️ | |
| colorFrom: blue | |
| colorTo: gray | |
| sdk: gradio | |
| sdk_version: 5.12.0 | |
| app_file: app.py | |
| pinned: true | |
| license: mit | |
| # SpaceDebris Localizer | |
| Use **NVIDIA LocateAnything-3B** to locate space debris, satellite fragments, and spacecraft components in orbital imagery. | |
| Orbital debris is a growing threat to satellite operations and crewed spaceflight. This project demonstrates how state-of-the-art vision-language grounding models can be applied to identify and localize objects in space imagery, from satellite solar panels and antennas to rocket bodies and debris fields. Built as a Hugging Face Spaces application, it provides a natural-language interface: describe what you're looking for, and the model draws bounding boxes around matching objects in the image. | |
| ## Why This Matters | |
| There are over 36,000 tracked objects in Earth orbit, and millions of smaller fragments too tiny to track. Traditional detection pipelines require specialized training data and domain-specific models. Vision-language grounding models like LocateAnything-3B offer a different approach: describe the target in natural language and let the model find it. This prototype explores whether general-purpose visual grounding can serve as a rapid-deployment tool for orbital debris awareness, satellite inspection, and space situational awareness workflows. | |
| ## Architecture | |
| ``` | |
| User uploads image + text prompt | |
|           │ | |
|           ▼ | |
| ┌─────────────────────┐ | |
| │  Gradio Interface   │ | |
| │      (app.py)       │ | |
| └─────────┬───────────┘ | |
|           │ | |
|           ▼ | |
| ┌─────────────────────┐ | |
| │ LocateAnythingWorker│ | |
| │ (src/inference.py)  │ | |
| │ ┌──────────────────┐│ | |
| │ │ nvidia/          ││ | |
| │ │ LocateAnything-  ││ | |
| │ │ 3B (3B params)   ││ | |
| │ └──────────────────┘│ | |
| └─────────┬───────────┘ | |
|           │ raw text with <box> tokens | |
|           ▼ | |
| ┌─────────────────────┐ | |
| │    Output Parser    │ | |
| │ (src/parsing.py)    │ | |
| │ Regex → BBox list   │ | |
| └─────────┬───────────┘ | |
|           │ structured BBox objects | |
|           ▼ | |
| ┌─────────────────────┐ | |
| │     Visualizer      │ | |
| │ (src/visualization) │ | |
| │ Draw boxes + labels │ | |
| └─────────┬───────────┘ | |
|           │ | |
|           ▼ | |
| Annotated image + JSON metadata | |
| ``` | |
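The "Regex → BBox list" stage can be illustrated with a minimal sketch. This is not the code in `src/parsing.py`: the `<box>x1,y1,x2,y2</box>` layout, the 0..1000 coordinate normalization, and the `BBox` fields are all assumptions made for illustration, so check them against the model's actual output before relying on them.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: hypothetical output format "<box>x1,y1,x2,y2</box>" with integer
# coordinates normalized to 0..1000. The real LocateAnything-3B token layout
# may differ; adjust BOX_PATTERN and COORD_SCALE to match it.
BOX_PATTERN = re.compile(
    r"<box>\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*</box>"
)
COORD_SCALE = 1000


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in pixel coordinates (hypothetical structure)."""

    x1: int
    y1: int
    x2: int
    y2: int


def parse_boxes(raw_text: str, width: int, height: int) -> list[BBox]:
    """Extract every <box> span and rescale it to the image size in pixels."""
    boxes = []
    for match in BOX_PATTERN.finditer(raw_text):
        nx1, ny1, nx2, ny2 = (int(group) for group in match.groups())
        # Sort corners so x1 <= x2 and y1 <= y2 even if the model swaps them.
        left, right = sorted((nx1, nx2))
        top, bottom = sorted((ny1, ny2))
        boxes.append(
            BBox(
                x1=left * width // COORD_SCALE,
                y1=top * height // COORD_SCALE,
                x2=right * width // COORD_SCALE,
                y2=bottom * height // COORD_SCALE,
            )
        )
    return boxes
```

Text that contains no recognizable `<box>` span yields an empty list, which lets the visualizer return the unannotated image instead of failing.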
| ## Setup | |
| ### Prerequisites | |
| - Python 3.10+ | |
| - CUDA-capable GPU (recommended) or CPU (slow) | |
| - ~8GB GPU memory for bfloat16 inference | |
| ### Local Installation | |
| ```bash | |
| git clone https://github.com/YOUR_USERNAME/space-debris-localizer.git | |
| cd space-debris-localizer | |
| pip install -e ".[dev]" | |
| ``` | |
| ### Run Locally | |
| ```bash | |
| python app.py | |
| ``` | |
| The app launches at `http://localhost:7860`. First run downloads the model (~6GB). | |
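The download size is consistent with a back-of-the-envelope estimate: roughly 3 billion parameters at 2 bytes each in bfloat16. The sketch below takes "3B" literally, which is an assumption; the exact parameter count is not stated here. The gap between this figure and the ~8GB listed under prerequisites is presumably headroom for activations and generation state.

```python
# Rough weight-size estimate; the parameter count is approximate (assumption).
PARAMS = 3_000_000_000
BYTES_PER_PARAM = {"bfloat16": 2, "float16": 2, "float32": 4}


def weight_gigabytes(dtype: str, params: int = PARAMS) -> float:
    """Return the size of the weights alone, in decimal gigabytes."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


print(weight_gigabytes("bfloat16"))  # 6.0
print(weight_gigabytes("float32"))   # 12.0
```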
| ### Environment Variables | |
| | Variable | Default | Description | | |
| |----------|---------|-------------| | |
| | `MODEL_ID` | `nvidia/LocateAnything-3B` | HuggingFace model ID | | |
| | `DEVICE` | `cuda` | Device (`cuda` or `cpu`) | | |
| | `DTYPE` | `bfloat16` | Model precision | | |
| | `MAX_NEW_TOKENS` | `8192` | Max generation tokens | | |
| | `GENERATION_MODE` | `hybrid` | `fast`, `slow`, or `hybrid` | | |
| | `PORT` | `7860` | Gradio server port | | |
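One way to read these variables with the defaults from the table is sketched below. The `Settings` class and `load_settings` function are hypothetical names chosen for illustration, not necessarily what `app.py` uses.

```python
import os
from dataclasses import dataclass

VALID_MODES = ("fast", "slow", "hybrid")


@dataclass(frozen=True)
class Settings:
    """Runtime configuration (hypothetical container, for illustration)."""

    model_id: str
    device: str
    dtype: str
    max_new_tokens: int
    generation_mode: str
    port: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    mode = env.get("GENERATION_MODE", "hybrid")
    if mode not in VALID_MODES:
        raise ValueError(f"GENERATION_MODE must be one of {VALID_MODES}, got {mode!r}")
    return Settings(
        model_id=env.get("MODEL_ID", "nvidia/LocateAnything-3B"),
        device=env.get("DEVICE", "cuda"),
        dtype=env.get("DTYPE", "bfloat16"),
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "8192")),
        generation_mode=mode,
        port=int(env.get("PORT", "7860")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test, for example `load_settings({"DEVICE": "cpu", "PORT": "8080"})`.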
| ## Deployment to Hugging Face Spaces | |
| ### Automatic Sync via GitHub Actions | |
| 1. Create a Hugging Face Space at [huggingface.co/new-space](https://huggingface.co/new-space) (select Gradio SDK) | |
| 2. Set these GitHub repository secrets: | |
|    - `HF_TOKEN`: your Hugging Face [access token](https://huggingface.co/settings/tokens) | |
|    - `HF_USERNAME`: your Hugging Face username | |
|    - `HF_SPACE_NAME`: your space name | |
| 3. Push to `main`. GitHub Actions will sync the repo to your HF Space automatically. | |
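The workflow file itself is not shown here. As a rough illustration of how the three secrets typically fit together, a sync job pushes to an authenticated Git remote for the Space. The helper below only builds that URL; `space_remote_url` is a hypothetical name, and the URL pattern is the commonly documented one for pushing to a Space over HTTPS.

```python
from urllib.parse import quote


def space_remote_url(username: str, space_name: str, token: str) -> str:
    """Build the authenticated HTTPS remote for a Hugging Face Space.

    Illustrative only: the repository's actual workflow may build this differently.
    """
    # Percent-encode the token so unusual characters cannot break the URL.
    safe_token = quote(token, safe="")
    return (
        f"https://{username}:{safe_token}"
        f"@huggingface.co/spaces/{username}/{space_name}"
    )
```

In a CI job the three arguments would come from `HF_USERNAME`, `HF_SPACE_NAME`, and `HF_TOKEN`. Avoid printing the resulting URL, since it embeds the token.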
| ### Manual Push | |
| ```bash | |
| # Clone your HF Space repo | |
| git clone https://huggingface.co/spaces/YOUR_USERNAME/space-debris-localizer | |
| cd space-debris-localizer | |
| # Copy project files | |
| cp -r /path/to/space-debris-localizer/* . | |
| git add . && git commit -m "deploy" && git push | |
| ``` | |
| ## Example Prompts | |
| - `Locate all the instances that match the following description: space debris.` | |
| - `Locate all the instances that match the following description: solar panel.` | |
| - `Locate a single instance that matches the following description: spacecraft.` | |
| - `Locate all the instances that match the following description: antenna.` | |
| - `Locate all the instances that match the following description: rocket body.` | |
| - `Locate all the instances that match the following description: thermal blanket.` | |
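The prompts above follow two fixed templates, one for all instances and one for a single instance. A small helper can fill them from a short description; `build_prompt` is a hypothetical name for illustration and is not necessarily part of this repository.

```python
ALL_TEMPLATE = (
    "Locate all the instances that match the following description: {description}."
)
SINGLE_TEMPLATE = (
    "Locate a single instance that matches the following description: {description}."
)


def build_prompt(description: str, single: bool = False) -> str:
    """Wrap a short object description in one of the two prompt templates."""
    # Drop surrounding whitespace and a trailing period to avoid "debris..".
    cleaned = description.strip().rstrip(".").strip()
    if not cleaned:
        raise ValueError("description must not be empty")
    template = SINGLE_TEMPLATE if single else ALL_TEMPLATE
    return template.format(description=cleaned)
```

For example, `build_prompt("spacecraft", single=True)` reproduces the third prompt in the list.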
| ## Known Limitations | |
| - **Domain gap:** The model was trained on general grounding data (COCO, LVIS, RefCOCO, etc.), not specifically on orbital imagery. Performance on space scenes is exploratory. | |
| - **Small debris:** Objects below a few pixels are unlikely to be grounded reliably. | |
| - **Image quality:** Detection depends heavily on image resolution and contrast. | |
| - **No confidence calibration:** The model does not output calibrated confidence scores; displayed confidence is a placeholder. | |
| - **GPU strongly recommended:** CPU inference works but is extremely slow due to the 3B parameter size. | |
| ## Future Work | |
| - Fine-tune on orbital debris datasets (e.g., ESA's DISCOS, ESA Clean Space imagery) | |
| - Integrate with real satellite imagery APIs (e.g., ESA Copernicus, Planet Labs) | |
| - Add temporal tracking across image sequences | |
| - Support video input for debris tracking | |
| - Add point-based localization for centroid estimation | |
| - Deploy with quantized model for faster CPU inference | |
| ## Tech Stack | |
| - **Model:** [nvidia/LocateAnything-3B](https://huggingface.co/nvidia/LocateAnything-3B) | |
| - **Framework:** Gradio 5.x, Hugging Face Transformers | |
| - **Language:** Python 3.10+ | |
| - **CI/CD:** GitHub Actions | |
| - **Deployment:** Hugging Face Spaces | |
| ## License | |
| MIT License. The underlying LocateAnything-3B model is subject to the [NVIDIA License](https://huggingface.co/nvidia/LocateAnything-3B/blob/main/LICENSE) (non-commercial research use). | |