Update README: Add model card metadata, ImageNet-1k metrics, and LiteRT usage example

#1
Files changed (1)
  1. README.md +119 -1
README.md CHANGED
@@ -1,8 +1,126 @@
  ---
  library_name: litert
+ pipeline_tag: image-classification
  tags:
  - vision
  - image-classification
+ - google
+ - computer-vision
  datasets:
  - imagenet-1k
- ---
+ model-index:
+ - name: litert-community/resnet152
+   results:
+   - task:
+       type: image-classification
+       name: Image Classification
+     dataset:
+       name: ImageNet-1k
+       type: imagenet-1k
+       config: default
+       split: validation
+     metrics:
+     - name: Top 1 Accuracy (Full Precision)
+       type: accuracy
+       value: 0.7822
+     - name: Top 5 Accuracy (Full Precision)
+       type: accuracy
+       value: 0.9411
+     - name: Top 1 Accuracy (Dynamic Quantized wi8 afp32)
+       type: accuracy
+       value: 0.7814
+     - name: Top 5 Accuracy (Dynamic Quantized wi8 afp32)
+       type: accuracy
+       value: 0.9410
+ ---
+
+ # ResNet-152
+
+ The ResNet-152 architecture is a convolutional neural network pre-trained on the ImageNet-1k dataset. Originally introduced by He et al. in the landmark paper [**Deep Residual Learning for Image Recognition**](https://arxiv.org/pdf/1512.03385), this model uses residual mappings to overcome the vanishing-gradient problem, enabling the training of substantially deeper networks.
+
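The residual idea mentioned above can be sketched in a few lines: a block computes a residual function F(x) and adds it back to its input through an identity shortcut, so the block only has to learn a correction on top of the identity. This is a toy fully-connected sketch for illustration, not the actual convolutional architecture:

```python
import numpy as np

def residual_block(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Toy fully-connected residual block: output = ReLU(x + F(x))."""
    f = np.maximum(x @ w1, 0.0) @ w2   # the residual function F(x)
    return np.maximum(x + f, 0.0)      # identity shortcut added back

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = rng.normal(scale=0.1, size=(8, 8))

y = residual_block(x, w1, w2)

# With a zero-weight residual branch, the block reduces to ReLU(x):
# the shortcut lets the layer default to (near-)identity behavior.
y_id = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
print(np.allclose(y_id, np.maximum(x, 0.0)))  # True
```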
+ ## Model description
+
+ The model was converted from a PyTorch Vision checkpoint.
+
+ The original model has:
+ - acc@1 (on ImageNet-1k): 82.284%
+ - acc@5 (on ImageNet-1k): 96.002%
+ - num_params: 60,192,808
+
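The model-card metrics compare the full-precision model with a dynamically quantized variant (int8 weights, float32 activations), and the accuracy drop is small. The intuition is that symmetric int8 quantization bounds the per-weight error to half a quantization step. The following is a minimal NumPy sketch of a per-tensor int8 weight round-trip, illustrative only and not the converter's actual implementation:

```python
import numpy as np

def quantize_dequantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 round-trip, as in dynamic-range quantization."""
    scale = np.abs(w).max() / 127.0  # map the largest-magnitude weight to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale, scale  # dequantized weights and scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(64, 64)).astype(np.float32)  # toy weight tensor
w_hat, scale = quantize_dequantize_int8(w)

# Worst-case error is scale / 2, i.e. about 1 / (2 * 127) of the weight range.
rel_err = np.abs(w - w_hat).max() / np.abs(w).max()
print(f"scale: {scale:.6f}, max relative error: {rel_err:.4f}")
```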
+ ## Intended uses & limitations
+
+ The model files were converted from pretrained PyTorch Vision weights. The models may carry their own licenses or terms and conditions derived from PyTorch Vision and the dataset used for training; it is your responsibility to determine whether you have permission to use the models for your use case.
+
+ ## Use
+
+ ```python
+ #!/usr/bin/env python3
+ import argparse
+ import json
+
+ import numpy as np
+ from PIL import Image
+ from huggingface_hub import hf_hub_download
+ from ai_edge_litert.compiled_model import CompiledModel
+
+
+ def preprocess(img: Image.Image) -> np.ndarray:
+     """Resize the shorter side to 232, center-crop to 224, normalize, return CHW."""
+     img = img.convert("RGB")
+     w, h = img.size
+     s = 232
+     if w < h:
+         img = img.resize((s, int(round(h * s / w))), Image.BILINEAR)
+     else:
+         img = img.resize((int(round(w * s / h)), s), Image.BILINEAR)
+     left = (img.size[0] - 224) // 2
+     top = (img.size[1] - 224) // 2
+     img = img.crop((left, top, left + 224, top + 224))
+
+     # Scale to [0, 1] and apply the standard ImageNet mean/std normalization.
+     x = np.asarray(img, dtype=np.float32) / 255.0
+     x = (x - np.array([0.485, 0.456, 0.406], dtype=np.float32)) / np.array(
+         [0.229, 0.224, 0.225], dtype=np.float32
+     )
+     return np.transpose(x, (2, 0, 1))  # HWC -> CHW
+
+
+ def main():
+     ap = argparse.ArgumentParser()
+     ap.add_argument("--image", required=True)
+     args = ap.parse_args()
+
+     # Download the converted model and the ImageNet-1k label map.
+     model_path = hf_hub_download("litert-community/resnet152", "resnet152.tflite")
+     labels_path = hf_hub_download(
+         "huggingface/label-files", "imagenet-1k-id2label.json", repo_type="dataset"
+     )
+     with open(labels_path, "r", encoding="utf-8") as f:
+         id2label = {int(k): v for k, v in json.load(f).items()}
+
+     img = Image.open(args.image)
+     x = preprocess(img)
+
+     # Compile the model and run inference on signature index 0.
+     model = CompiledModel.from_file(model_path)
+     inp = model.create_input_buffers(0)
+     out = model.create_output_buffers(0)
+
+     inp[0].write(x)
+     model.run_by_index(0, inp, out)
+
+     # Read back the output logits as float32.
+     req = model.get_output_buffer_requirements(0, 0)
+     y = out[0].read(req["buffer_size"] // np.dtype(np.float32).itemsize, np.float32)
+
+     pred = int(np.argmax(y))
+     label = id2label.get(pred, f"class_{pred}")
+
+     print(f"Top-1 class index: {pred}")
+     print(f"Top-1 label: {label}")
+
+
+ if __name__ == "__main__":
+     main()
+ ```
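The script above prints only the top-1 class. To report top-5 predictions with probabilities, the logits `y` can be post-processed with a small NumPy helper (a sketch; in the script, `id2label` would map the returned indices to names). The demo below uses synthetic logits in place of a real model output:

```python
import numpy as np

def top_k(logits: np.ndarray, k: int = 5) -> list[tuple[int, float]]:
    """Return the k highest-probability (class_index, probability) pairs."""
    z = logits - logits.max()        # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()  # softmax over all classes
    idx = np.argsort(p)[::-1][:k]    # indices of the k largest probabilities
    return [(int(i), float(p[i])) for i in idx]

# Demo with synthetic logits; in the script above, pass the model output `y`.
rng = np.random.default_rng(0)
logits = rng.normal(size=1000).astype(np.float32)
for i, p in top_k(logits):
    print(f"class_{i}: {p:.4f}")
```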
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ @inproceedings{he2016deep,
+   title={Deep residual learning for image recognition},
+   author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
+   booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
+   pages={770--778},
+   year={2016}
+ }
+ ```