---
language: en
license: apache-2.0
pipeline_tag: image-classification
tags:
- computer-vision
- image-classification
- mobilenet-v2
- cifar100
- whirlwindai
datasets:
- cifar100
metrics:
- accuracy
---
Vision, Simplified.
Small models can recognize more than their size suggests.
GVM explores efficient computer vision using lightweight architectures, fast inference, and practical deployment.
Designed to run almost anywhere.
Classification Performance
| Epoch | Training Loss | Validation Accuracy |
|---|---|---|
| 1 | 3.36 | 41.75% |
| 2 | 2.78 | 47.14% |
| 3 | 2.64 | 47.40% |
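The validation accuracy column is presumably top-1 accuracy: the share of validation images whose highest-scoring class matches the label. The sketch below shows that calculation on toy data; the function name `top1_accuracy` is hypothetical and this is not the evaluation script used for the table. For scale, random guessing over 100 balanced classes would score about 1%.

```python
def top1_accuracy(predictions, labels):
    """Return the percentage of predictions that equal their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Toy example: 3 of 4 predicted class indices are correct.
print(top1_accuracy([3, 7, 7, 1], [3, 7, 2, 1]))  # 75.0
```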
Quick Start
import requests
import timm
import torch
import torchvision.transforms as transforms
from PIL import Image

response = requests.get(
    "https://huggingface.co/WhirlwindAI/GVM/resolve/main/config.json",
    timeout=30,
)
response.raise_for_status()  # fail early if the download did not succeed
config = response.json()  # provides "num_classes" and "class_names"

model = timm.create_model(
    "mobilenetv2_100",
    pretrained=False,  # weights come from model.pth below
    num_classes=config["num_classes"],
)
state = torch.hub.load_state_dict_from_url(
    "https://huggingface.co/WhirlwindAI/GVM/resolve/main/model.pth",
    map_location="cpu",
)
model.load_state_dict(state)
model.eval()  # put batch norm and dropout layers in inference mode

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(  # ImageNet channel statistics
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

image = Image.open("image.jpg").convert("RGB")
tensor = transform(image).unsqueeze(0)  # add a batch dimension

with torch.no_grad():  # gradients are not needed for inference
    prediction = model(tensor).argmax(1).item()

print(config["class_names"][prediction])
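The Quick Start prints only the single most likely class. With accuracy below 50%, the next few candidates are often worth seeing too. The following is a minimal, framework-free sketch of a top-k readout using a softmax over raw scores; the helper name `top_k` and the toy inputs are assumptions for illustration, not part of the repository.

```python
import math


def top_k(logits, class_names, k=5):
    """Return the k most likely (class_name, probability) pairs."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(class_names[i], probs[i]) for i in ranked[:k]]


# Toy scores for three classes, used only to demonstrate the helper.
for name, prob in top_k([2.0, 0.5, 1.0], ["apple", "bear", "clock"], k=2):
    print(f"{name}: {prob:.3f}")
```

With the Quick Start variables, the same helper could be called as `top_k(model(tensor)[0].tolist(), config["class_names"])`.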
Highlights
| Property | Value |
|---|---|
| Architecture | MobileNetV2 |
| Dataset | CIFAR-100 |
| Classes | 100 |
| Model Size | 14 MB |
| Framework | PyTorch |
| Inference | CPU & GPU Friendly |
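As a rough guide to where a figure like 14 MB comes from, a float32 checkpoint stores four bytes per parameter. The sketch below does that arithmetic; the helper name `checkpoint_size_mb` is hypothetical and the 3.5 million parameter count is an assumed round figure for illustration, not a measurement of `model.pth`.

```python
def checkpoint_size_mb(num_params, bytes_per_param=4):
    """Estimate checkpoint size in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


# Assumed round figure of 3.5 million parameters, stored as float32.
print(checkpoint_size_mb(3_500_000))  # 14.0
# The same number of weights in float16 would take half the space.
print(checkpoint_size_mb(3_500_000, bytes_per_param=2))  # 7.0
```

The estimate ignores file-format overhead and non-trainable buffers, so the size on disk can differ somewhat.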
Repository Contents
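- model.pth: trained weights, loaded as a state dict in Quick Start
- config.json: model settings, including num_classes and class_names
- README.md: this model card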
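The Quick Start code reads two fields from `config.json`: `num_classes` and `class_names`. The sketch below checks that these two agree before they are used; the helper name `validate_config` and the shortened example are assumptions, and the real file may contain further fields.

```python
import json


def validate_config(config):
    """Check the two fields the Quick Start code reads from config.json."""
    num_classes = config["num_classes"]
    class_names = config["class_names"]
    if len(class_names) != num_classes:
        raise ValueError(
            f"expected {num_classes} class names, got {len(class_names)}"
        )
    return config


# Hypothetical, shortened stand-in for the real file.
example = json.loads(
    '{"num_classes": 3, "class_names": ["apple", "bear", "clock"]}'
)
validate_config(example)
print(example["class_names"][1])  # bear
```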
Current Limitations
- Trained for only 3 epochs
- Frozen backbone during training
- CIFAR-100 is considerably harder than CIFAR-10: 100 classes with 500 training images each, versus 10 classes with 5,000 each
- Intended as an efficient baseline rather than a state-of-the-art classifier
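To illustrate what a frozen backbone means in practice, the sketch below disables gradients for every layer except the final classifier, so only the head is updated during training. It uses a toy `nn.Sequential` as a stand-in and is not the training code behind this model; with a timm model, the head is typically reachable through `model.get_classifier()` instead of `model[-1]`.

```python
import torch.nn as nn

# Toy stand-in for a backbone plus classifier head; not the GVM architecture.
model = nn.Sequential(
    nn.Linear(8, 16),    # plays the role of the feature extractor
    nn.ReLU(),
    nn.Linear(16, 100),  # plays the role of the 100-class head
)

# Freeze everything, then re-enable gradients for the head only.
for param in model.parameters():
    param.requires_grad = False
for param in model[-1].parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} of {total}")
```

Full backbone fine-tuning, listed in the roadmap, would correspond to leaving `requires_grad` enabled for all parameters.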
Roadmap
- Higher resolution training
- Full backbone fine-tuning
- Improved augmentation
- ONNX export
- TensorRT support
- Interactive demo