---
license: apache-2.0
pipeline_tag: image-segmentation
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---
EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation
EdgeCrafter is a unified framework for high-performance dense prediction on resource-constrained edge devices. It combines task-specialized distillation with edge-aware encoder-decoder designs to produce compact Vision Transformers (ViTs) that compete with CNN-based architectures such as YOLO.
This repository contains a checkpoint for ECSeg, the instance segmentation variant of the framework, which achieves a strong accuracy-efficiency tradeoff.
- Paper: EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation
- Project Page: EdgeCrafter Project Page
- Code: GitHub Repository
Model Description
Deploying high-performance dense prediction models on resource-constrained edge devices remains challenging because of strict limits on computation and memory. EdgeCrafter addresses this with distilled compact backbones and edge-friendly encoder-decoder designs. For instance segmentation, ECSeg achieves performance comparable to RF-DETR while using substantially fewer parameters, which indicates that compact ViTs can be a practical and competitive option for edge deployment.
Usage
This model is compatible with PyTorchModelHubMixin from the huggingface_hub library. For detailed instructions on installation, training, and running inference, please refer to the official GitHub repository.
Citation
If you find this project useful in your research, please consider citing:
@article{liu2026edgecrafter,
  title={EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation},
  author={Liu, Longfei and Hou, Yongjie and Li, Yang and Wang, Qirui and Sha, Youyang and Yu, Yongjun and Wang, Yinzhi and Ru, Peizhe and Yu, Xuanlong and Shen, Xi},
  journal={arXiv},
  year={2026}
}