Update README: Add model card metadata, ImageNet-1k metrics, and LiteRT usage example

#2
Files changed (1)
  1. README.md +168 -0
README.md CHANGED
@@ -1,8 +1,176 @@
  ---
  library_name: litert
+ pipeline_tag: image-classification
  tags:
  - vision
  - image-classification
+ - google
+ - computer-vision
  datasets:
  - imagenet-1k
+ model-index:
+ - name: litert-community/squeezenet1_1
+   results:
+   - task:
+       type: image-classification
+       name: Image Classification
+     dataset:
+       name: ImageNet-1k
+       type: imagenet-1k
+       config: default
+       split: validation
+     metrics:
+     - name: Top 1 Accuracy (Full Precision)
+       type: accuracy
+       value: 0.5819
+     - name: Top 5 Accuracy (Full Precision)
+       type: accuracy
+       value: 0.8059
+     - name: Top 1 Accuracy (Dynamic Quantized wi8 afp32)
+       type: accuracy
+       value: 0.5809
+     - name: Top 5 Accuracy (Dynamic Quantized wi8 afp32)
+       type: accuracy
+       value: 0.8053
  ---
+
+ # Squeezenet1_1
+
+ SqueezeNet 1.1 is a compact image-classification model pre-trained on the ImageNet-1k dataset at 224x224 resolution. The architecture is described in the paper ["SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size"](https://arxiv.org/abs/1602.07360), and version 1.1 was released via the [official repository](https://github.com/forresti/SqueezeNet/tree/master/SqueezeNet_v1.1). Compared with SqueezeNet 1.0, it requires 2.4x less computation (0.35 GFLOPS) and has slightly fewer parameters, without sacrificing accuracy.
+
+ ## Model description
+
+ The model was converted from the PyTorch Vision checkpoint `SqueezeNet1_1_Weights.IMAGENET1K_V1`.
+
+ The original model has:
+ - acc@1 (on ImageNet-1K): 58.178%
+ - acc@5 (on ImageNet-1K): 80.624%
+ - num_params: 1,235,496
+
+ This model is released under the BSD 3-Clause License, inheriting the license of the `torchvision` repository from which it was converted.
+
+
+ ## Intended uses & limitations
+
+ The model files were converted from pretrained weights from PyTorch Vision. The models may have their own licenses or terms and conditions derived from PyTorch Vision and the dataset used for training. It is your responsibility to determine whether you have permission to use the models for your use case.
+
+ ## How to Use
+
+ **1. Install Dependencies**
+
+ Install the required libraries by running the following command in your terminal:
+
+ ```bash
+ pip install numpy Pillow huggingface_hub ai-edge-litert
+ ```
+
+ **2. Prepare Your Image**
+
+ The script expects an image file to analyze. Make sure you have an image (e.g., `cat.jpg` or `car.png`) saved in the same working directory as your script.
+
+
+ **3. Save the Script**
+
+ Create a new file named `classify.py`, paste the script below into it, and save the file:
+
+ ```python
+ #!/usr/bin/env python3
+ import os
+ import argparse
+ import json
+ import numpy as np
+ from PIL import Image
+ from huggingface_hub import hf_hub_download
+ from ai_edge_litert.compiled_model import CompiledModel
+
+ def preprocess(img: Image.Image) -> np.ndarray:
+     img = img.convert("RGB")
+     w, h = img.size
+
+     # Resize shortest edge to 256
+     s = 256
+     if w < h:
+         img = img.resize((s, int(round(h * s / w))), Image.BILINEAR)
+     else:
+         img = img.resize((int(round(w * s / h)), s), Image.BILINEAR)
+
+     # Central crop to 224x224
+     left = (img.size[0] - 224) // 2
+     top = (img.size[1] - 224) // 2
+     img = img.crop((left, top, left + 224, top + 224))
+
+     # Rescale to [0.0, 1.0] and normalize with the ImageNet mean and std
+     x = np.asarray(img, dtype=np.float32) / 255.0
+     x = (x - np.array([0.485, 0.456, 0.406], dtype=np.float32)) / np.array(
+         [0.229, 0.224, 0.225], dtype=np.float32
+     )
+
+     # Transpose from HWC (224, 224, 3) to CHW (3, 224, 224)
+     x = np.transpose(x, (2, 0, 1))
+
+     # Add the batch dimension to create NCHW (1, 3, 224, 224)
+     x = np.expand_dims(x, axis=0)
+
+     # Make the array contiguous so the runtime reads the bytes in NCHW order
+     x = np.ascontiguousarray(x, dtype=np.float32)
+
+     return x
+
+ def main():
+     ap = argparse.ArgumentParser()
+     ap.add_argument("--image", required=True, help="Path to the input image")
+     args = ap.parse_args()
+
+     # Download the TFLite model and labels for squeezenet1_1
+     model_path = hf_hub_download("litert-community/squeezenet1_1", "squeezenet1_1.tflite")
+     labels_path = hf_hub_download(
+         "huggingface/label-files", "imagenet-1k-id2label.json", repo_type="dataset"
+     )
+
+     with open(labels_path, "r", encoding="utf-8") as f:
+         id2label = {int(k): v for k, v in json.load(f).items()}
+
+     img = Image.open(args.image)
+     x = preprocess(img)
+
+     model = CompiledModel.from_file(model_path)
+     inp = model.create_input_buffers(0)
+     out = model.create_output_buffers(0)
+
+     # Write the contiguous NCHW float32 tensor into the first input buffer
+     inp[0].write(x)
+     model.run_by_index(0, inp, out)
+
+     req = model.get_output_buffer_requirements(0, 0)
+     y = out[0].read(req["buffer_size"] // np.dtype(np.float32).itemsize, np.float32)
+
+     pred = int(np.argmax(y))
+     label = id2label.get(pred, f"class_{pred}")
+
+     print(f"Top-1 class index: {pred}")
+     print(f"Top-1 label: {label}")
+
+ if __name__ == "__main__":
+     main()
+ ```
+ **4. Execute the Python Script**
+
+ Run the following command:
+
+ ```bash
+ python classify.py --image cat.jpg
+ ```
+
+ ### BibTeX Entry and Citation Info
+
+ ```bibtex
+ @misc{iandola2016squeezenetalexnetlevelaccuracy50x,
+       title={SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size},
+       author={Forrest N. Iandola and Song Han and Matthew W. Moskewicz and Khalid Ashraf and William J. Dally and Kurt Keutzer},
+       year={2016},
+       eprint={1602.07360},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV},
+       url={https://arxiv.org/abs/1602.07360},
+ }
+ ```
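
The `classify.py` script in this change prints only the top-1 class, while the model card metadata reports both top-1 and top-5 accuracy. Below is a minimal sketch of how the script's flat output vector `y` could be turned into a top-5 listing. The helper names `softmax` and `top_k` are hypothetical and not part of the PR. Treating the raw outputs as scores to pass through softmax is also an assumption; the ranking itself does not depend on it, because softmax preserves order.

```python
import numpy as np


def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D score vector."""
    z = scores.astype(np.float64) - np.max(scores)
    e = np.exp(z)
    return e / e.sum()


def top_k(scores: np.ndarray, id2label: dict, k: int = 5) -> list:
    """Return the k highest-scoring (index, label, probability) triples."""
    probs = softmax(np.ravel(scores))
    # argsort is ascending, so reverse it and keep the first k entries.
    order = np.argsort(probs)[::-1][:k]
    return [
        (int(i), id2label.get(int(i), f"class_{int(i)}"), float(probs[i]))
        for i in order
    ]


if __name__ == "__main__":
    # Toy 4-class vector standing in for the 1000-class output of the script.
    demo_scores = np.array([0.1, 2.0, -1.0, 0.5], dtype=np.float32)
    demo_labels = {0: "a", 1: "b", 2: "c", 3: "d"}
    for idx, label, p in top_k(demo_scores, demo_labels, k=2):
        print(f"{idx}\t{label}\t{p:.4f}")
```

Inside `main()`, this could be called as `top_k(y, id2label)` right after the output buffer is read.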