---
tags:
- image-classification
- timm
- MobileNetV4
license: apache-2.0
datasets:
- imagenet-1k
pipeline_tag: image-classification
---
# Model card for MobileNetV4_Conv_Large_TFLite_256

A MobileNet-V4 image classification model. Trained on ImageNet-1k by Ross Wightman.

Converted to TFLite Float32 & Float16 formats by Youssef Boulaouane.

## Model Details
- **PyTorch Weights:** https://huggingface.co/timm/mobilenetv4_conv_large.e500_r256_in1k
- **Model Type:** Image classification
- **Model Stats:**
  - Params (M): 32.6
  - GMACs: 2.9
  - Activations (M): 12.1
  - Input Shape: (1, 256, 256, 3)
- **Dataset:** ImageNet-1k
- **Papers:**
  - MobileNetV4 -- Universal Models for the Mobile Ecosystem: https://arxiv.org/abs/2404.10518
  - PyTorch Image Models: https://github.com/huggingface/pytorch-image-models
- **Original:** https://github.com/tensorflow/models/tree/master/official/vision

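The input shape listed above is channels-last (NHWC), whereas the PyTorch weights linked above take channels-first (NCHW) tensors. When reusing a PyTorch-style preprocessing pipeline, the array therefore needs its axes reordered before it reaches the TFLite interpreter. A minimal sketch, assuming a zero-filled array as a stand-in for a preprocessed image batch:

```python
import numpy as np

# Stand-in for a preprocessed batch in PyTorch layout: (batch, channels, height, width)
nchw = np.zeros((1, 3, 256, 256), dtype=np.float32)
nchw[0, 1, 10, 20] = 1.0  # mark one pixel in channel 1 to check the mapping

# Reorder the axes to (batch, height, width, channels) and make the result contiguous
nhwc = np.ascontiguousarray(np.transpose(nchw, (0, 2, 3, 1)))

print(nhwc.shape)  # (1, 256, 256, 3)
```

The marked value ends up at `nhwc[0, 10, 20, 1]`, which confirms that only the axis order changes, not the pixel data.
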
## Model Usage
### Image Classification in Python
```python
import numpy as np
import tensorflow as tf
from PIL import Image

# Placeholder paths: point these at the downloaded .tflite file and a test image
tf_model_path = "path/to/model.tflite"
image_path = "path/to/image.jpg"

# Load the label file (one class name per line)
with open('imagenet_classes.txt', 'r') as file:
    lines = file.readlines()

index_to_label = {index: line.strip() for index, line in enumerate(lines)}

# Initialize interpreter and IO details
tfl_model = tf.lite.Interpreter(model_path=tf_model_path)
tfl_model.allocate_tensors()
input_details = tfl_model.get_input_details()
output_details = tfl_model.get_output_details()

# Load the image as RGB and resize it to the model's input resolution
image = Image.open(image_path).convert("RGB").resize((256, 256), Image.BICUBIC)

# Scale to [0, 1], then normalize with the ImageNet mean and std
image = np.array(image, dtype=np.float32)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
image = (image / 255.0 - mean) / std

# Add the batch dimension: (256, 256, 3) -> (1, 256, 256, 3)
image = np.expand_dims(image, axis=0)

# Inference and postprocessing
input_tensor = input_details[0]
tfl_model.set_tensor(input_tensor["index"], image)
tfl_model.invoke()

tfl_output = tfl_model.get_tensor(output_details[0]["index"])
tfl_output_tensor = tf.convert_to_tensor(tfl_output)
tfl_softmax_output = tf.nn.softmax(tfl_output_tensor, axis=1)

tfl_top5_probs, tfl_top5_indices = tf.math.top_k(tfl_softmax_output, k=5)

# Get the top-5 class labels and probabilities
tfl_probs_list = tfl_top5_probs[0].numpy().tolist()
tfl_index_list = tfl_top5_indices[0].numpy().tolist()

for index, prob in zip(tfl_index_list, tfl_probs_list):
    print(f"{index_to_label[index]}: {round(prob*100, 2)}%")
```
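
The softmax and top-5 steps of the postprocessing can also be written in NumPy alone, which may be convenient where only the interpreter is needed from TensorFlow. The following is a minimal sketch, not part of the original conversion script: `softmax` and `top_k` are hypothetical helper names, and random logits stand in for the interpreter output (assumed here to have shape `(1, 1000)`).

```python
import numpy as np


def softmax(logits: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def top_k(probs: np.ndarray, k: int = 5):
    """Return the indices and values of the k largest entries of a 1-D array."""
    indices = np.argsort(probs)[::-1][:k]
    return indices.tolist(), probs[indices].tolist()


# Stand-in for the interpreter output: random logits, batch size 1, 1000 classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 1000)).astype(np.float32)

probs = softmax(logits, axis=1)
top5_indices, top5_probs = top_k(probs[0], k=5)

for index, prob in zip(top5_indices, top5_probs):
    print(f"class {index}: {prob * 100:.2f}%")
```

With a real model, `logits` would be replaced by the array returned from `get_tensor`, and the class index would be mapped through the label dictionary as in the example above.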

### Deployment on Mobile
Refer to the guides available at https://ai.google.dev/edge/lite/inference

## Citation
```bibtex
@article{qin2024mobilenetv4,
  title={MobileNetV4-Universal Models for the Mobile Ecosystem},
  author={Qin, Danfeng and Leichner, Chas and Delakis, Manolis and Fornoni, Marco and Luo, Shixin and Yang, Fan and Wang, Weijun and Banbury, Colby and Ye, Chengxi and Akin, Berkin and others},
  journal={arXiv preprint arXiv:2404.10518},
  year={2024}
}
```
```bibtex
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}
```