license: apache-2.0
pipeline_tag: image-segmentation
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
EdgeCrafter: ECSeg-L
EdgeCrafter is a unified compact Vision Transformer (ViT) framework for efficient dense prediction on edge devices. This model, ECSeg-L, targets instance segmentation on resource-constrained hardware. It is part of the work presented in "EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation".
- Paper: EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation
- Repository: https://github.com/Intellindust-AI-Lab/EdgeCrafter
- Project Page: https://intellindust-ai-lab.github.io/projects/EdgeCrafter/
Model Description
EdgeCrafter addresses the performance gap between compact ViTs and CNN-based architectures such as YOLO on edge devices. It combines task-specialized distillation with an edge-friendly encoder-decoder design to reach a strong accuracy-efficiency tradeoff. ECSeg-L applies this design to instance segmentation.
Usage
To use this model, follow the installation instructions in the official GitHub repository. You can then run inference from the `ecdetseg` directory with the following command:
cd ecdetseg
# Run PyTorch inference
# Replace `path/to/your/image.jpg` with an actual image path and `/path/to/ecseg_l.pth` with the path to the ECSeg-L weights
python tools/inference/torch_inf.py -c configs/ecseg/ecseg_l.yml -r /path/to/ecseg_l.pth -i path/to/your/image.jpg
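To process a folder of images, the command above can be wrapped in a small Python script. The following is a minimal sketch, not part of the repository: the helper names `build_inference_command` and `run_on_folder`, the `dry_run` switch, and the list of image suffixes are assumptions made for illustration. Only the script path and the `-c`, `-r`, and `-i` flags come from the command above.

```python
import subprocess
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your data uses.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_inference_command(config, weights, image):
    """Return the argument list for one torch_inf.py call.

    The script path and the -c / -r / -i flags mirror the command shown above.
    """
    return [
        "python", "tools/inference/torch_inf.py",
        "-c", str(config),
        "-r", str(weights),
        "-i", str(image),
    ]


def run_on_folder(image_dir, config, weights, repo_dir="ecdetseg", dry_run=True):
    """Build one command per image in image_dir, and run them unless dry_run is set.

    Image paths are made absolute because the commands execute inside repo_dir.
    """
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    commands = [build_inference_command(config, weights, p.resolve()) for p in images]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the script exits with an error.
            subprocess.run(cmd, cwd=repo_dir, check=True)
    return commands


if __name__ == "__main__":
    # Print the commands without running them (dry run).
    for cmd in run_on_folder(".", "configs/ecseg/ecseg_l.yml", "/path/to/ecseg_l.pth"):
        print(" ".join(cmd))
```

Because the script runs with `repo_dir` as its working directory, the config path stays relative to `ecdetseg`, while the weights path should be absolute. Pass `dry_run=False` to actually launch inference once the printed commands look right.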
For loading models directly via the Hugging Face Hub, check the hf_models.ipynb notebook in the repository.
Citation
@article{liu2026edgecrafter,
title={EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation},
author={Liu, Longfei and Hou, Yongjie and Li, Yang and Wang, Qirui and Sha, Youyang and Yu, Yongjun and Wang, Yinzhi and Ru, Peizhe and Yu, Xuanlong and Shen, Xi},
journal={arXiv},
year={2026}
}