---
library_name: litert
pipeline_tag: image-classification
tags:
- vision
- image-classification
- computer-vision
datasets:
- imagenet-1k
model-index:
- name: inception_v3
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: ImageNet-1k
      type: imagenet-1k
      config: default
      split: validation
    metrics:
    - name: Top 1 Accuracy (Full Precision)
      type: accuracy
      value: 0.7727
    - name: Top 5 Accuracy (Full Precision)
      type: accuracy
      value: 0.9343
---
# Inception_v3
Inception v3 model pre-trained on ImageNet-1k. It was introduced in [**Rethinking the Inception Architecture for Computer Vision**](https://arxiv.org/abs/1512.00567) by Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna.
## Intended uses & limitations
The model files were converted from pretrained weights from PyTorch Vision. The models may have their own licenses or terms and conditions derived from PyTorch Vision and the dataset used for training. It is your responsibility to determine whether you have permission to use the models for your use case.
## Model description
The model was converted from the PyTorch Vision checkpoint [`Inception_V3_Weights.IMAGENET1K_V1`](https://docs.pytorch.org/vision/main/models/generated/torchvision.models.inception_v3.html#torchvision.models.Inception_V3_Weights).

The original model has:
- acc@1 (on ImageNet-1K): 77.294%
- acc@5 (on ImageNet-1K): 93.450%
- num_params: 27,161,264
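For reference, acc@1 and acc@5 above are standard top-k accuracies: the fraction of validation images whose true label is among the model's k highest-scoring classes. A minimal sketch of the metric itself, using toy logits rather than real model outputs:

```python
import numpy as np

def topk_accuracy(logits: np.ndarray, labels: np.ndarray, k: int) -> float:
    """Fraction of rows whose true label is among the k highest logits."""
    topk = np.argsort(logits, axis=1)[:, -k:]
    hits = [label in row for label, row in zip(labels, topk)]
    return float(np.mean(hits))

logits = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2]])
labels = np.array([2, 0])
print(topk_accuracy(logits, labels, 1))  # 0.5
print(topk_accuracy(logits, labels, 2))  # 1.0
```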
## Use
```python
#!/usr/bin/env python3
import argparse
import json

import numpy as np
from PIL import Image
from huggingface_hub import hf_hub_download
from ai_edge_litert.compiled_model import CompiledModel


def preprocess(img: Image.Image) -> np.ndarray:
    img = img.convert("RGB")
    w, h = img.size
    # Inception_v3 expects a resize to 342 prior to the 299 central crop
    s = 342
    if w < h:
        img = img.resize((s, int(round(h * s / w))), Image.BILINEAR)
    else:
        img = img.resize((int(round(w * s / h)), s), Image.BILINEAR)
    # Central crop to 299x299
    left = (img.size[0] - 299) // 2
    top = (img.size[1] - 299) // 2
    img = img.crop((left, top, left + 299, top + 299))
    # Rescale to [0.0, 1.0] and normalize with the ImageNet mean/std
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = (x - np.array([0.485, 0.456, 0.406], dtype=np.float32)) / np.array(
        [0.229, 0.224, 0.225], dtype=np.float32
    )
    # Expand dimensions to create NHWC 4D tensor: (1, 299, 299, 3)
    x = np.expand_dims(x, axis=0)
    return x


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--image", required=True)
    args = ap.parse_args()

    # Download the TFLite model and labels
    model_path = hf_hub_download("litert-community/inception_v3", "inception_v3.tflite")
    labels_path = hf_hub_download(
        "huggingface/label-files", "imagenet-1k-id2label.json", repo_type="dataset"
    )
    with open(labels_path, "r", encoding="utf-8") as f:
        id2label = {int(k): v for k, v in json.load(f).items()}

    img = Image.open(args.image)
    x = preprocess(img)

    # Compile the model and allocate input/output buffers for signature 0
    model = CompiledModel.from_file(model_path)
    inp = model.create_input_buffers(0)
    out = model.create_output_buffers(0)
    inp[0].write(x)
    model.run_by_index(0, inp, out)

    # Read the output logits back as float32
    req = model.get_output_buffer_requirements(0, 0)
    y = out[0].read(req["buffer_size"] // np.dtype(np.float32).itemsize, np.float32)

    pred = int(np.argmax(y))
    label = id2label.get(pred, f"class_{pred}")
    print(f"Top-1 class index: {pred}")
    print(f"Top-1 label: {label}")


if __name__ == "__main__":
    main()
```
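The script above reports only the arg-max class. If you want ranked class probabilities (e.g. a top-5 list, matching the acc@5 metric), a small numpy helper can be applied to the raw output vector `y`. The function below is an illustrative addition, not part of the LiteRT API, shown here on a toy 5-class logit vector:

```python
import numpy as np

def top5_probs(logits: np.ndarray) -> list[tuple[int, float]]:
    # Numerically stable softmax over the class logits
    z = logits - np.max(logits)
    p = np.exp(z) / np.sum(np.exp(z))
    # Indices of the five largest probabilities, highest first
    idx = np.argsort(p)[::-1][:5]
    return [(int(i), float(p[i])) for i in idx]

demo = np.array([2.0, 1.0, 0.5, 3.0, -1.0])
print(top5_probs(demo))  # class 3 ranks first, then 0, 1, 2, 4
```

In the script, this would be called as `top5_probs(y)` and each index mapped through `id2label`.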
### BibTeX entry and citation info
```bibtex
@inproceedings{szegedy2016rethinking,
title={Rethinking the inception architecture for computer vision},
author={Szegedy, Christian and Vanhoucke, Vincent and Ioffe, Sergey and Shlens, Jonathon and Wojna, Zbigniew},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={2818--2826},
year={2016}
}
``` |