---
datasets:
- imagenet-1k
library_name: transformers
pipeline_tag: image-classification
---

# SwiftFormer

## Model description

The SwiftFormer model was proposed in [SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications](https://arxiv.org/abs/2303.15446) by Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, and Fahad Shahbaz Khan.

The SwiftFormer paper introduces a novel efficient additive attention mechanism that replaces the quadratic matrix-multiplication operations in self-attention with linear element-wise multiplications. A series of models called SwiftFormer is built on this mechanism and achieves state-of-the-art performance in terms of both accuracy and mobile inference speed: even the small variant reaches 78.5% top-1 ImageNet-1K accuracy with only 0.8 ms latency on an iPhone 14, which is both more accurate and 2× faster than MobileViT-v2.
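
To make the mechanism concrete, below is a minimal PyTorch sketch of efficient additive attention, written from the paper's high-level description rather than from the released code; the module name, dimensions, and the exact residual combination are illustrative assumptions. A learned vector scores each query token, the scores pool the queries into a single global query, and that global query modulates the keys element-wise, so the cost stays linear in the number of tokens:

```python
import torch
import torch.nn as nn

class EfficientAdditiveAttention(nn.Module):
    """Sketch of SwiftFormer-style additive attention (illustrative, not the reference code)."""

    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.w_g = nn.Parameter(torch.randn(dim, 1))  # learned query-scoring vector
        self.scale = dim ** -0.5
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_tokens, dim)
        q = self.to_q(x)
        k = self.to_k(x)
        # Scalar score per token: (batch, n_tokens, 1); softmax over the token axis
        attn = torch.softmax((q @ self.w_g) * self.scale, dim=1)
        # Global query: score-weighted sum of all query tokens, (batch, 1, dim)
        global_q = (attn * q).sum(dim=1, keepdim=True)
        # Element-wise query-key interaction replaces the n x n matrix product
        out = global_q * k
        return self.proj(out) + q  # linear transform plus a query residual

x = torch.randn(2, 196, 64)  # e.g. 14x14 patch tokens of width 64
print(EfficientAdditiveAttention(64)(x).shape)  # torch.Size([2, 196, 64])
```

Note that no (n_tokens × n_tokens) attention map is ever materialized; every operation is linear in the token count, which is what makes the block cheap on mobile hardware.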

## Intended uses & limitations

You can use the raw model to classify images into the 1,000 ImageNet-1K classes.

## How to use

```python
import requests
from PIL import Image
from transformers import ViTImageProcessor
from transformers.models.swiftformer import SwiftFormerForImageClassification

# Load a test image from the COCO validation set
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image into a batch of pixel values
processor = ViTImageProcessor.from_pretrained('shehan97/swiftformer-xs')
inputs = processor(images=image, return_tensors="pt")

# Load the classifier and run a forward pass
model = SwiftFormerForImageClassification.from_pretrained('shehan97/swiftformer-xs')
output = model(inputs['pixel_values'])

# Pick the highest-scoring ImageNet class
logits = output.logits
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
```
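
If you just want predictions, the same checkpoint should also work through the high-level `pipeline` API; this is a generic transformers convenience rather than something specific to this card:

```python
from transformers import pipeline

# The pipeline bundles preprocessing, the forward pass, and label decoding
classifier = pipeline("image-classification", model="shehan97/swiftformer-xs")
print(classifier("http://images.cocodataset.org/val2017/000000039769.jpg"))
```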

## Limitations and bias

## Training data

The classification model is trained on the ImageNet-1K dataset, which contains roughly 1.3 million labeled images covering 1,000 classes.
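
If you want to inspect the training data yourself, the dataset is available on the Hugging Face Hub. A minimal sketch, assuming you have accepted the dataset's access terms and authenticated (imagenet-1k is gated):

```python
from datasets import load_dataset

# Stream the gated imagenet-1k dataset without downloading all of it;
# requires prior `huggingface-cli login` and accepted access terms.
dataset = load_dataset("imagenet-1k", split="train", streaming=True)
example = next(iter(dataset))
print(example["label"], example["image"].size)
```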

## Training procedure

## Evaluation results

See the [SwiftFormer paper](https://arxiv.org/abs/2303.15446) for ImageNet-1K results across all model variants.