---
license: apache-2.0
tags:
- vit
---

# Vision Transformer (base-sized model)

This repository provides randomly initialized weights for the base-sized ViT model. At each step, the model selects a random subset of the masked image patches.
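
The random patch selection described above can be illustrated with a minimal sketch. It is not taken from this model's implementation: the helper `sample_patch_subset`, the 224x224 image size, the 16x16 patch size, and the 0.75 selection ratio are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random


def sample_patch_subset(num_patches, ratio, seed=None):
    """Return the sorted indices of a random subset of patches.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    rng = random.Random(seed)  # local generator so the global state is untouched
    num_selected = int(num_patches * ratio)
    # rng.sample draws without replacement, so each index appears at most once
    return sorted(rng.sample(range(num_patches), num_selected))


# Assumed geometry (not stated in this card): 224x224 image, 16x16 patches
num_patches = (224 // 16) ** 2

# A fresh subset would be drawn at every step; the seed here only makes
# the example repeatable.
selected = sample_patch_subset(num_patches, ratio=0.75, seed=0)
print(len(selected), "of", num_patches, "patches selected")
```

Passing a different seed (or none) at each step yields a different subset, which is the behavior the description above refers to.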