license: apache-2.0
pipeline_tag: object-detection
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
- vision
- vit
- edge-ai
EdgeCrafter: ECDet-S
EdgeCrafter is a unified compact Vision Transformer (ViT) framework for edge dense prediction, introduced in the paper "EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation".
ECDet-S is the object detection model within this framework. It combines a distilled compact backbone with an edge-friendly encoder-decoder design. On the COCO dataset, it achieves 51.7 AP with fewer than 10M parameters, using only COCO annotations for training.
- Paper: arXiv:2603.18739
- GitHub Repository: Intellindust-AI-Lab/EdgeCrafter
- Project Page: EdgeCrafter Project Page
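For readers unfamiliar with the metric, the sketch below illustrates what a COCO-style AP number is built on, assuming the 51.7 AP figure refers to the standard COCO box AP, which is averaged over ten IoU thresholds from 0.50 to 0.95. The helper names (`box_iou`, `matched_thresholds`) are illustrative and not part of the EdgeCrafter repository, and this is not a full evaluation, which also involves ranking predictions by confidence and integrating precision over recall.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Standard COCO thresholds: 0.50, 0.55, ..., 0.95 (ten values).
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which this prediction would count as a true positive."""
    iou = box_iou(pred_box, gt_box)
    return [t for t in COCO_IOU_THRESHOLDS if iou >= t]
```

A prediction that overlaps its ground-truth box with IoU 0.81 counts as a match at the seven thresholds up to 0.80 and as a miss at the three stricter ones, which is why tight localization matters for this metric.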
Model Description
Deploying high-performance dense prediction models on resource-constrained edge devices is challenging. EdgeCrafter addresses this by using task-specialized distillation to enhance task-specific representation learning in small-scale ViTs. This approach allows compact ViTs to achieve accuracy-efficiency trade-offs competitive with traditional CNN-based architectures like YOLO.
Sample Usage (Inference)
To run inference on a sample image using the provided scripts in the official repository:
# 1. Clone the repository and install dependencies
git clone https://github.com/Intellindust-AI-Lab/EdgeCrafter
cd EdgeCrafter
pip install -r requirements.txt
# 2. Run PyTorch inference
cd ecdetseg
# Replace `path/to/your/image.jpg` with an actual image path
python tools/inference/torch_inf.py -c configs/ecdet/ecdet_s.yml -r ecdet_s.pth -i path/to/your/image.jpg
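The same invocation can be scripted from Python, for example to loop over several images. The following is a minimal sketch, not part of the repository: `build_inference_command` and `run_inference` are hypothetical helper names, and it assumes the checkpoint `ecdet_s.pth` sits inside the `ecdetseg` directory, as the command above implies. Only the `-c`, `-r`, and `-i` flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path

DEFAULT_CONFIG = "configs/ecdet/ecdet_s.yml"
DEFAULT_CHECKPOINT = "ecdet_s.pth"


def build_inference_command(image_path, config=DEFAULT_CONFIG,
                            checkpoint=DEFAULT_CHECKPOINT):
    """Return the argument list for tools/inference/torch_inf.py."""
    return [
        sys.executable,                  # same interpreter as the caller
        "tools/inference/torch_inf.py",
        "-c", str(config),
        "-r", str(checkpoint),
        "-i", str(image_path),
    ]


def run_inference(image_path, workdir="ecdetseg", config=DEFAULT_CONFIG,
                  checkpoint=DEFAULT_CHECKPOINT):
    """Verify the inputs exist, then run the script from `workdir`.

    Relative paths are resolved against `workdir`, because the script is
    launched with that directory as its working directory.
    """
    workdir = Path(workdir)
    inputs = (("config", config), ("checkpoint", checkpoint), ("image", image_path))
    for label, path in inputs:
        resolved = workdir / path        # an absolute `path` is kept unchanged
        if not resolved.is_file():
            raise FileNotFoundError(f"{label} not found: {resolved}")
    cmd = build_inference_command(image_path, config, checkpoint)
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(cmd, cwd=workdir, check=True)
```

From the repository root, `run_inference("/absolute/path/to/image.jpg")` would then be equivalent to the shell command above. Failing early on a missing config, checkpoint, or image gives a clearer error than letting the script fail partway through model loading.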
This model was pushed to the Hub using the PyTorchModelHubMixin integration.
Citation
@article{liu2026edgecrafter,
title={EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation},
author={Liu, Longfei and Hou, Yongjie and Li, Yang and Wang, Qirui and Sha, Youyang and Yu, Yongjun and Wang, Yinzhi and Ru, Peizhe and Yu, Xuanlong and Shen, Xi},
journal={arXiv preprint arXiv:2603.18739},
year={2026}
}