Upload fine-tuned MobileViT-DR with ONNX

Files changed (6)

README.md +76 -0
config.json +60 -0
mithu-vit.onnx +3 -0
model.safetensors +3 -0
preprocessor_config.json +18 -0
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,76 @@

+---
+tags:
+- vision
+- image-classification
+- onnx
+- mobilevit
+- medical
+datasets:
+- rohithgowdax/processed-dr
+library_name: transformers
+widget:
+- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/tiger.jpg
+  example_title: Placeholder Image (not a retinal scan)
+---
+# Mithu-ViT: Diabetic Retinopathy Classifier
+This is a **MobileViT (Small)** model fine-tuned on the [Processed Diabetic Retinopathy dataset](https://www.kaggle.com/datasets/rohithgowdax/processed-dr).
+It classifies retina scans into 5 severity levels:
+- **0**: No DR
+- **1**: Mild
+- **2**: Moderate
+- **3**: Severe
+- **4**: Proliferative DR
+## Model Details
+- **Architecture**: MobileViT-Small (Apple)
+- **Format**: PyTorch (`model.safetensors`) and ONNX (`mithu-vit.onnx`)
+- **Resolution**: 256x256
+- **License**: Apache 2.0
+## Usage (PyTorch)
+```python
+from transformers import MobileViTForImageClassification, MobileViTImageProcessor
+from PIL import Image
+import torch
+# 1. Load Model
+model = MobileViTForImageClassification.from_pretrained("YOUR_USERNAME/mithu-mobilevit-dr")
+processor = MobileViTImageProcessor.from_pretrained("YOUR_USERNAME/mithu-mobilevit-dr")
+# 2. Load Image
+image = Image.open("path_to_eye_scan.jpg").convert("RGB")
+# 3. Predict
+inputs = processor(images=image, return_tensors="pt")
+with torch.no_grad():
+    outputs = model(**inputs)
+print("Predicted Class:", model.config.id2label[outputs.logits.argmax(-1).item()])
+```
+## Usage (ONNX)
+```python
+import onnxruntime as ort
+import numpy as np
+from PIL import Image
+# 1. Start Session
+session = ort.InferenceSession("mithu-vit.onnx")
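+# 2. Prepare Input (mirrors preprocessor_config.json: resize shortest edge to 288, center-crop 256x256)
+img = Image.open("test.jpg").convert("RGB")
+w, h = img.size
+new_size = (288, 288 * h // w) if w <= h else (288 * w // h, 288)
+img = img.resize(new_size, Image.BILINEAR)
+left, top = (img.width - 256) // 2, (img.height - 256) // 2
+img = img.crop((left, top, left + 256, top + 256))
+# HWC RGB -> CHW BGR (do_flip_channel_order is true), then rescale to [0, 1]
+img_data = np.array(img)[:, :, ::-1].transpose(2, 0, 1).astype(np.float32) / 255.0
+img_data = np.ascontiguousarray(np.expand_dims(img_data, axis=0))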
+# 3. Run
+outputs = session.run(None, {"pixel_values": img_data})
+print("Logits:", outputs[0])
+```
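
Both usage snippets in the README stop at raw logits or at an `id2label` lookup, and the committed `config.json` stores only `"0"` to `"4"` as label names. As a minimal sketch (not part of this commit), the logits could be turned into a readable severity name and a probability as follows. The `SEVERITY_NAMES` list is copied from the README's class list; the helper names `softmax` and `describe_prediction` are hypothetical.

```python
import math

# Severity names copied from the README's class list (assumption: index i
# of the logits corresponds to class i in that list).
SEVERITY_NAMES = ["No DR", "Mild", "Moderate", "Severe", "Proliferative DR"]


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits):
    """Return (severity name, probability) for one row of five logits."""
    if len(logits) != len(SEVERITY_NAMES):
        raise ValueError(
            f"expected {len(SEVERITY_NAMES)} logits, got {len(logits)}"
        )
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SEVERITY_NAMES[best], probs[best]
```

With the ONNX snippet, the input would be `outputs[0][0].tolist()`; with the PyTorch snippet, `outputs.logits[0].tolist()`.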

config.json ADDED

	@@ -0,0 +1,60 @@

+{
+  "architectures": [
+    "MobileViTForImageClassification"
+  ],
+  "aspp_dropout_prob": 0.1,
+  "aspp_out_channels": 256,
+  "atrous_rates": [
+    6,
+    12,
+    18
+  ],
+  "attention_probs_dropout_prob": 0.0,
+  "classifier_dropout_prob": 0.1,
+  "conv_kernel_size": 3,
+  "dtype": "float32",
+  "expand_ratio": 4.0,
+  "hidden_act": "silu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_sizes": [
+    144,
+    192,
+    240
+  ],
+  "id2label": {
+    "0": "0",
+    "1": "1",
+    "2": "2",
+    "3": "3",
+    "4": "4"
+  },
+  "image_size": 256,
+  "initializer_range": 0.02,
+  "label2id": {
+    "0": 0,
+    "1": 1,
+    "2": 2,
+    "3": 3,
+    "4": 4
+  },
+  "layer_norm_eps": 1e-05,
+  "mlp_ratio": 2.0,
+  "model_type": "mobilevit",
+  "neck_hidden_sizes": [
+    16,
+    32,
+    64,
+    96,
+    128,
+    160,
+    640
+  ],
+  "num_attention_heads": 4,
+  "num_channels": 3,
+  "output_stride": 32,
+  "patch_size": 2,
+  "problem_type": "single_label_classification",
+  "qkv_bias": true,
+  "semantic_loss_ignore_index": 255,
+  "transformers_version": "4.57.1"
+}

mithu-vit.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2d29b9baa72245e419481fa0c225f55de5092546fbfc2f8ff6fd0e92ddb382ee
+size 20029305
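
The three lines above are a Git LFS pointer, not the model itself: they record the hash algorithm, the digest, and the byte size of the real file. A minimal sketch of how a downloaded copy could be checked against such a pointer is shown below. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is assumed to be passed without the diff's leading `+` markers.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer ('key value' per line) into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file's size and hash against a parsed pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so large model files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For this commit, `matches_pointer("mithu-vit.onnx", parse_lfs_pointer(pointer_text))` would be expected to return `True` only for a file of 20029305 bytes with the listed SHA-256 digest.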

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:94234ee5222fbdb8247a47c31b38994daf6a249edf350564134d70445c2b8095
+size 19859260

preprocessor_config.json ADDED

	@@ -0,0 +1,18 @@

+{
+  "crop_size": {
+    "height": 256,
+    "width": 256
+  },
+  "do_center_crop": true,
+  "do_flip_channel_order": true,
+  "do_flip_channels": true,
+  "do_reduce_labels": false,
+  "do_rescale": true,
+  "do_resize": true,
+  "image_processor_type": "MobileViTImageProcessor",
+  "resample": 2,
+  "rescale_factor": 0.00392156862745098,
+  "size": {
+    "shortest_edge": 288
+  }
+}
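
This configuration resizes each image so its shortest edge is 288 pixels and then center-crops to 256x256, so non-square inputs lose part of their longer side. The arithmetic can be sketched as below. The function name `resize_then_center_crop` is hypothetical, and the integer truncation of the longer side is an assumption about how the processor rounds; the exact behavior may differ by a pixel.

```python
def resize_then_center_crop(width, height, shortest_edge=288, crop=256):
    """Return the resized (width, height) and the crop box.

    The crop box is (left, top, right, bottom) in resized-image pixels.
    """
    if width <= height:
        new_w, new_h = shortest_edge, shortest_edge * height // width
    else:
        new_w, new_h = shortest_edge * width // height, shortest_edge
    left, top = (new_w - crop) // 2, (new_h - crop) // 2
    return (new_w, new_h), (left, top, left + crop, top + crop)


# A 1000x500 image is resized to 576x288, then cropped to the central 256x256.
print(resize_then_center_crop(1000, 500))
# → ((576, 288), (160, 16, 416, 272))
```

A direct `resize((256, 256))` skips this crop and squashes non-square images instead, which changes the aspect ratio the model sees.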

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:684f7993cad9ee69fedd1c8ac84f26605ef06bd143357b7c1a287785fb61548d
+size 5841