---
license: cc-by-nc-sa-4.0
tags:
- arxiv:2606.14024
---
# ViT-Up
**ViT-Up** is an implicit feature upsampler for Vision Transformers that predicts backbone-aligned features at arbitrary continuous image coordinates.
This repository provides pretrained ViT-Up weights for DINOv3-S+ and DINOv3-B.
- Paper: https://arxiv.org/abs/2606.14024
- HF Paper page: https://huggingface.co/papers/2606.14024
- Project page: https://vitup.papers.discuna.com/
- Code: https://github.com/krispinwandel/vit-up
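To make "features at arbitrary continuous image coordinates" concrete, here is a minimal sketch of the query pattern: a coarse patch-feature grid goes in, along with a normalized `(x, y)` location, and one feature vector comes out. It uses plain bilinear interpolation as a stand-in. It is not the ViT-Up model and not this repository's API. The function name `sample_bilinear`, the nested-list grid layout, and the patch-center convention are all assumptions made for illustration. See the code repository linked above for actual usage.

```python
from typing import List

# Hypothetical layout: grid[row][col] is the feature vector of one ViT patch.
Grid = List[List[List[float]]]


def sample_bilinear(grid: Grid, x: float, y: float) -> List[float]:
    """Bilinear baseline (NOT ViT-Up): feature at normalized coords x, y in [0, 1].

    Assumes patch centers sit at ((col + 0.5) / cols, (row + 0.5) / rows).
    """
    rows, cols = len(grid), len(grid[0])
    # Map normalized image coordinates to continuous grid coordinates,
    # clamping so that queries near the border reuse the edge patches.
    gx = min(max(x * cols - 0.5, 0.0), cols - 1.0)
    gy = min(max(y * rows - 0.5, 0.0), rows - 1.0)
    x0, y0 = int(gx), int(gy)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    wx, wy = gx - x0, gy - y0

    out = []
    for c in range(len(grid[0][0])):
        top = (1 - wx) * grid[y0][x0][c] + wx * grid[y0][x1][c]
        bottom = (1 - wx) * grid[y1][x0][c] + wx * grid[y1][x1][c]
        out.append((1 - wy) * top + wy * bottom)
    return out


# A 2x2 grid of 1-channel "features"; query the image center.
toy_grid = [[[0.0], [1.0]], [[2.0], [3.0]]]
print(sample_bilinear(toy_grid, 0.5, 0.5))  # [1.5]
```

A learned implicit upsampler answers the same kind of query with a network conditioned on the backbone features in place of the fixed interpolation weights shown here.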
## Citation
```bibtex
@misc{wandel2026vitupfaithfulfeatureupsampling,
title={ViT-Up: Faithful Feature Upsampling for Vision Transformers},
author={Krispin Wandel and Jingchuan Wang and Hesheng Wang},
year={2026},
eprint={2606.14024},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2606.14024},
}
```