# DRSeg Data

DRSeg is a UAV reasoning segmentation benchmark with 10,000 image-level samples.
Each sample contains a UAV image, one or more segmentation annotations, a
reasoning question, a reasoning answer, and a reasoning type.

## Splits

| Split | Samples |
| --- | ---: |
| Train | 2,999 |
| Validation | 2,000 |
| Test | 5,001 |

## Expected Layout

```text
data/DRSeg/
├── DRtrain/
├── DRval/
├── DRtest/
├── label/
│   ├── DRSeg_train.json
│   ├── DRSeg_val.json
│   └── DRSeg_test.json
├── CODrone -> .
└── labels -> label
```

The `CODrone` and `labels` entries are compatibility links for the original
dataset loader.

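
The layout above can be checked, and the two compatibility links created, with a short script. This is a minimal sketch, not part of the original loader: the helper names `missing_entries` and `ensure_compat_links` are hypothetical, and it assumes a filesystem that supports symbolic links.

```python
import os
from pathlib import Path

# Directory and file names taken from the expected layout.
EXPECTED_DIRS = ("DRtrain", "DRval", "DRtest", "label")
EXPECTED_LABELS = ("DRSeg_train.json", "DRSeg_val.json", "DRSeg_test.json")
# Link name -> target, both relative to the dataset root.
COMPAT_LINKS = {"CODrone": ".", "labels": "label"}


def missing_entries(root):
    """Return the expected directories and label files absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [
        f"label/{name}"
        for name in EXPECTED_LABELS
        if not (root / "label" / name).is_file()
    ]
    return missing


def ensure_compat_links(root):
    """Create the CODrone and labels symlinks unless an entry already exists."""
    root = Path(root)
    created = []
    for name, target in COMPAT_LINKS.items():
        link = root / name
        if link.is_symlink() or link.exists():
            continue  # leave existing entries untouched
        os.symlink(target, link, target_is_directory=True)
        created.append(name)
    return created
```

Calling `missing_entries("data/DRSeg")` returns an empty list when the directories and label files are in place; `ensure_compat_links("data/DRSeg")` returns the names of the links it created.
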
## Reasoning Types

- `spatial`: position and spatial relation reasoning.
- `attribute`: visual attribute and object property reasoning.
- `scene`: scene-context reasoning.

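
To see how samples are distributed across the three types, something like the following could be used. It is a sketch under assumptions: the annotation schema is not documented here, so the field name `reasoning_type` is a guess passed in as a parameter, and the function name is hypothetical.

```python
from collections import Counter

# The three reasoning types listed above.
REASONING_TYPES = ("spatial", "attribute", "scene")


def count_reasoning_types(samples, key="reasoning_type"):
    """Count samples per reasoning type.

    `samples` is an iterable of dicts; `key` is an ASSUMED field name and
    may differ in the actual annotation files.
    """
    counts = Counter({t: 0 for t in REASONING_TYPES})
    for sample in samples:
        value = sample.get(key)
        if value not in REASONING_TYPES:
            raise ValueError(f"unexpected reasoning type: {value!r}")
        counts[value] += 1
    return dict(counts)
```

Raising on an unknown value makes a schema mismatch show up immediately instead of silently producing a short count.
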
## Metadata Preview

The Hugging Face dataset repo includes lightweight metadata JSONL files under
`metadata/`. They are intended for dataset-card preview and quick inspection.
Use the full image/mask archive for training and evaluation.
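
For quick inspection, a JSONL file can be read one record per line with the standard library. The sketch below assumes only that each non-empty line is a standalone JSON value; the helper names are hypothetical, and no particular file name under `metadata/` is implied.

```python
import json
from itertools import islice


def iter_jsonl(path):
    """Yield one parsed record per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def preview(path, n=3):
    """Return the first n records without parsing the rest of the file."""
    return list(islice(iter_jsonl(path), n))
```

Because `iter_jsonl` is a generator, `preview` stops after `n` records, which keeps inspection cheap even for a large metadata file.
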