Commit b45bf7c (verified) by Divyasreepat · 1 parent: 609d98e

Update README.md with new model card content

Files changed (1): README.md (+90 −0)
library_name: keras-hub
---

### Model Overview
# Model Summary

Vision Transformer (ViT) adapts the Transformer architecture, originally designed for natural language processing, to computer vision. It treats an image as a sequence of patches, much as a Transformer treats a sentence as a sequence of words. It was introduced in the paper [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929).
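To make the patch-sequence view concrete: the sequence length a ViT sees is just the patch grid squared. A quick illustrative sketch (plain Python, not part of the KerasHub API):

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping patches a square image is split into."""
    grid = image_size // patch_size
    return grid * grid

# A 224x224 image with 16x16 patches gives a 14x14 grid, i.e. 196 patch tokens.
print(num_patches(224, 16))  # -> 196
# At 384x384 the same patch size gives a 24x24 grid, i.e. 576 tokens.
print(num_patches(384, 16))  # -> 576
```

This is why the 384-resolution presets below have slightly more parameters than their 224 counterparts: the longer patch sequence needs a larger position-embedding table.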

## Links

* [ViT Quickstart Notebook](https://www.kaggle.com/code/sineeli/vit-quickstart)
* ViT API Documentation (coming soon)
* [ViT Model Card](https://huggingface.co/google/vit-base-patch16-224)
* [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
* [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)

## Installation

Keras and KerasHub can be installed with:

```shell
pip install -U -q keras-hub
pip install -U -q keras
```

## Presets

Model ID | Image size | Top-1 Acc | Top-5 Acc | Parameters |
:--: | :--: | :--: | :--: | :--: |
**Base** | | | | |
vit_base_patch16_224_imagenet | 224 | - | - | 85,798,656 |
vit_base_patch16_224_imagenet21k | 224 | - | - | 85,798,656 |
vit_base_patch16_384_imagenet | 384 | - | - | 86,090,496 |
vit_base_patch32_224_imagenet21k | 224 | - | - | 87,455,232 |
vit_base_patch32_384_imagenet | 384 | - | - | 87,528,192 |
**Large** | | | | |
vit_large_patch16_224_imagenet | 224 | - | - | 303,301,632 |
vit_large_patch16_224_imagenet21k | 224 | - | - | 303,301,632 |
vit_large_patch16_384_imagenet | 384 | - | - | 303,690,752 |
vit_large_patch32_224_imagenet21k | 224 | - | - | 305,510,400 |
vit_large_patch32_384_imagenet | 384 | - | - | 305,607,680 |
**Huge** | | | | |
vit_huge_patch14_224_imagenet21k | 224 | - | - | 630,764,800 |

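The preset IDs follow a fixed pattern: family, model size, patch size, input resolution, pretraining data. A small helper (hypothetical, for illustration only, not part of KerasHub) shows how to unpack one:

```python
def parse_preset(preset_id: str) -> dict:
    """Split a preset ID like 'vit_base_patch16_384_imagenet' into its parts."""
    family, size, patch, resolution, weights = preset_id.split("_")
    return {
        "family": family,                        # "vit"
        "size": size,                            # "base" / "large" / "huge"
        "patch": int(patch.removeprefix("patch")),  # 16 / 32 / 14
        "resolution": int(resolution),           # 224 / 384
        "weights": weights,                      # "imagenet" / "imagenet21k"
    }

print(parse_preset("vit_base_patch16_384_imagenet"))
```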
## Example Usage

### Pretrained ViT model

```python
import numpy as np
import keras_hub

image_classifier = keras_hub.models.ImageClassifier.from_preset(
    "vit_base_patch16_384_imagenet"
)

# A random batch of two 384x384 RGB images with values in [0, 1],
# matching the 384-resolution preset.
input_data = np.random.uniform(0, 1, size=(2, 384, 384, 3))
image_classifier(input_data)
```

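The random `input_data` above stands in for real images. A minimal sketch (NumPy only; the `raw_images` array is hypothetical placeholder data) of turning a batch of `uint8` pixel arrays into the float batch used above:

```python
import numpy as np

# Stand-in for two decoded 384x384 RGB photos as uint8 pixel values.
raw_images = np.random.randint(0, 256, size=(2, 384, 384, 3), dtype=np.uint8)

# Scale to float32 in [0, 1], the same range as the random input above.
input_data = raw_images.astype("float32") / 255.0
print(input_data.shape)  # -> (2, 384, 384, 3)
```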

### Load backbone weights and fine-tune on a custom dataset

```python
import keras_hub

backbone = keras_hub.models.Backbone.from_preset(
    "vit_base_patch16_384_imagenet"
)
preprocessor = keras_hub.models.ViTImageClassifierPreprocessor.from_preset(
    "vit_base_patch16_384_imagenet"
)
# CLASSES is your dataset's list of class names.
model = keras_hub.models.ViTImageClassifier(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
```


## Example Usage with Hugging Face URI

### Pretrained ViT model

```python
import numpy as np
import keras_hub

image_classifier = keras_hub.models.ImageClassifier.from_preset(
    "hf://keras/vit_base_patch16_384_imagenet"
)

# A random batch of two 384x384 RGB images with values in [0, 1],
# matching the 384-resolution preset.
input_data = np.random.uniform(0, 1, size=(2, 384, 384, 3))
image_classifier(input_data)
```


### Load backbone weights and fine-tune on a custom dataset

```python
import keras_hub

backbone = keras_hub.models.Backbone.from_preset(
    "hf://keras/vit_base_patch16_384_imagenet"
)
preprocessor = keras_hub.models.ViTImageClassifierPreprocessor.from_preset(
    "hf://keras/vit_base_patch16_384_imagenet"
)
# CLASSES is your dataset's list of class names.
model = keras_hub.models.ViTImageClassifier(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
```