Instructions for using microsoft/swin-base-patch4-window7-224-in22k with the Transformers library, inference providers, notebooks, and local apps.
How to use microsoft/swin-base-patch4-window7-224-in22k with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-classification", model="microsoft/swin-base-patch4-window7-224-in22k")
pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png")
```

```python
# Load model directly
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("microsoft/swin-base-patch4-window7-224-in22k")
model = AutoModelForImageClassification.from_pretrained("microsoft/swin-base-patch4-window7-224-in22k")
```
Output attention shape
by Yingshu
I explored the output attentions and found that the output is a tuple of 4 tensors, one per stage, with shapes:
[64,4,49,49]
[16,8,49,49]
[4,16,49,49]
[1,32,49,49]
I know the second dimension (4, 8, 16, 32) is the number of attention heads. What does the first dimension (64, 16, 4, 1) represent?
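A likely explanation, hedged since it comes from the Swin architecture in general rather than this exact checkpoint: Swin computes self-attention inside local windows, and the Transformers implementation folds the windows into the batch dimension. So the leading dimension should be `batch_size * num_windows` at each stage, and the last two dimensions are `window_size**2 = 49` tokens attending to each other. A minimal sketch that reproduces the observed numbers for a 224×224 input, patch size 4, window size 7, and batch size 1:

```python
# Sketch (assumption): attention tensors in Swin have shape
# (batch_size * num_windows, num_heads, window_size**2, window_size**2),
# with the feature map halved in each spatial dimension per stage.
image_size, patch_size, window_size = 224, 4, 7
batch_size = 1

leading_dims = []
tokens_per_side = image_size // patch_size  # 56 tokens per side at stage 1
for stage in range(4):
    side = tokens_per_side // (2 ** stage)       # 56, 28, 14, 7
    num_windows = (side // window_size) ** 2     # 64, 16, 4, 1 windows
    leading_dims.append(batch_size * num_windows)

print(leading_dims)        # [64, 16, 4, 1]
print(window_size ** 2)    # 49
```

This matches the shapes reported above: 64, 16, 4, and 1 windows across the four stages, each with 49 tokens per window.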