Can't reproduce outliers (patch embeddings of very high norm)
#7, opened by niktheod
Hi. I was trying to reproduce the behaviour described in the paper "Vision Transformers Need Registers", but I never observed a single outlier. I encoded 1000 different images with this model, and the norm of every patch embedding in the output was consistently below 60. Is this an improved version of the model in which the issue has been resolved?
Hi,
This model corresponds to the original DINOv2 large model released here: https://github.com/facebookresearch/dinov2?tab=readme-ov-file#pretrained-models. The conversion script can be found here: https://github.com/huggingface/transformers/blob/main/src/transformers/models/dinov2/convert_dinov2_to_hf.py.
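For anyone who wants to repeat the check from the original post, here is a minimal sketch of the norm computation. It is not taken from the paper or from the conversion script. The helper names `patch_norms` and `find_outliers` are hypothetical, the toy vectors are made up, and the threshold of 60 simply mirrors the bound reported in the question rather than any value from the paper.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def patch_norms(patch_embeddings: Iterable[Sequence[float]]) -> List[float]:
    """Return the L2 norm of each patch embedding vector."""
    return [math.sqrt(sum(x * x for x in vec)) for vec in patch_embeddings]


def find_outliers(norms: Sequence[float], threshold: float) -> List[Tuple[int, float]]:
    """Return (patch index, norm) pairs whose norm exceeds the threshold."""
    return [(i, n) for i, n in enumerate(norms) if n > threshold]


# Toy 2-dimensional "patch embeddings" (illustrative only, not model output).
toy_patches = [[3.0, 4.0], [0.0, 1.0], [60.0, 80.0]]
toy_norms = patch_norms(toy_patches)

# Assumption: 60.0 is the bound mentioned in the question, used here as a cutoff.
toy_outliers = find_outliers(toy_norms, threshold=60.0)
print(toy_norms)
print(toy_outliers)
```

With the Transformers port, the patch embeddings for one image would presumably come from `outputs.last_hidden_state[0, 1:]` (the first token being the CLS token), converted with `.tolist()` before being passed to `patch_norms`. That slicing is an assumption worth verifying against your own setup.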