---
license: apache-2.0
tags:
- vit
---

# Vision Transformer (base-sized model)

Random weights are provided for the ViT model. During each step, the model selects a random subset of the masked image patches.
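
The per-step patch selection described above could be sketched as follows. This is only an illustration of drawing a random subset of patch indices, not the model's actual code: the function name `sample_patch_subset`, the 224x224 input size, the 16x16 patch size, and the 0.75 ratio are all assumptions that do not come from this card.

```python
import random


def sample_patch_subset(num_patches, ratio, rng):
    """Return sorted indices of a random subset of patches (hypothetical helper)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    num_selected = int(num_patches * ratio)
    # Sample without replacement so no patch index is picked twice.
    return sorted(rng.sample(range(num_patches), num_selected))


# Assumed geometry: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
NUM_PATCHES = (224 // 16) ** 2

rng = random.Random(0)  # seeded only so the sketch is repeatable
for step in range(3):
    # A fresh subset is drawn at every step.
    selected = sample_patch_subset(NUM_PATCHES, 0.75, rng)
    print(step, len(selected), selected[:5])
```

With these assumed values each step selects 147 of the 196 patch indices. Passing the random generator in explicitly keeps the selection reproducible in tests while still varying from step to step.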