---
license: apache-2.0
tags:
- image-classification
- rotation-prediction
- resnet
- pytorch
- vision
datasets:
- ILSVRC/imagenet-1k
pipeline_tag: image-classification
library_name: transformers
---

# 🔄 GyroScope: Image Rotation Prediction

**GyroScope** is a ResNet-18 trained **from scratch** to detect whether an image is rotated by **0°, 90°, 180°, or 270°** and to correct that rotation automatically.

> Is that photo upside down? Let GyroScope figure it out.

---

## 🎯 Task

Given any image, GyroScope classifies its orientation into one of **4 classes**:

| Label | Meaning | Correction |
|-------|---------|------------|
| 0 | 0° (upright) ✅ | None |
| 1 | 90° CCW | Rotate 270° CCW |
| 2 | 180° (upside down) | Rotate 180° |
| 3 | 270° CCW (= 90° CW) | Rotate 90° CCW |

**Correction formula:** `correction = (360 - detected_angle) % 360`

---

## 📊 Benchmarks

Trained on **50,000 images** from [ImageNet-1k](https://huggingface.co/datasets/ILSVRC/imagenet-1k) × 4 rotations = **200k training samples**.
Validated on **5,000 images** × 4 rotations = **20k validation samples**.

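
The sample counts above come from turning each upright image into four labeled samples, one per rotation. The following is a minimal sketch of that idea and of the correction formula, not the code from `train.py`. It uses a tiny nested list in place of real pixels, and `rotate_ccw`, `make_rotation_samples`, and `correction_angle` are hypothetical helper names. With Pillow, `Image.rotate(angle, expand=True)` plays a similar role, since it also rotates counter-clockwise.

```python
# Illustrative sketch only: a small nested list stands in for an image.

def rotate_ccw(grid, angle):
    """Rotate a 2D grid counter-clockwise by a multiple of 90 degrees."""
    if angle % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    for _ in range((angle // 90) % 4):
        # One 90° CCW turn: transpose, then reverse the row order.
        grid = [list(row) for row in zip(*grid)][::-1]
    return grid


def make_rotation_samples(grid):
    """Return the four (rotated_grid, label) pairs for one upright image."""
    return [(rotate_ccw(grid, 90 * label), label) for label in range(4)]


def correction_angle(detected_angle):
    """CCW angle that undoes a detected CCW rotation."""
    return (360 - detected_angle) % 360


upright = [[1, 2, 3],
           [4, 5, 6]]

samples = make_rotation_samples(upright)

# Applying the correction to each rotated copy restores the upright image.
for rotated, label in samples:
    restored = rotate_ccw(rotated, correction_angle(90 * label))
    assert restored == upright

# 50,000 source images x 4 rotations each
print(len(samples) * 50_000)  # 200000
```

Because every label is generated from the rotation itself, no manual annotation is needed. The same property is why flips are excluded from augmentation: they would change the apparent orientation without changing the label.
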

| Metric | Value |
|--------|-------|
| **Overall Val Accuracy** | **79.81%** |
| Per-class: 0° (upright) | 79.8% |
| Per-class: 90° CCW | 80.1% |
| Per-class: 180° | 79.4% |
| Per-class: 270° CCW | 79.8% |
| Training Epochs | 12 |
| Training Time | ~4h (Kaggle T4 GPU) |

### Training Curve

| Epoch | Train Acc | Val Acc |
|-------|-----------|---------|
| 1 | 41.4% | 43.2% |
| 2 | 52.0% | 46.9% |
| 3 | 59.4% | 62.8% |
| 4 | 64.1% | 66.0% |
| 5 | 67.8% | 69.48% |
| 6 | 70.6% | 72.22% |
| 7 | 73.3% | 74.25% |
| 8 | 75.6% | 76.49% |
| 9 | 77.5% | 77.47% |
| 10 | 79.1% | 79.47% |
| 11 | 80.3% | 79.78% |
| 12 | 80.9% | 79.81% |

---

## 🏗️ Architecture

| Detail | Value |
|--------|-------|
| Base | ResNet-18 (from scratch, **no pretrained weights**) |
| Parameters | 11.2M |
| Input | 224 × 224 RGB |
| Output | 4 classes (0°, 90°, 180°, 270°) |
| Framework | 🤗 Hugging Face Transformers (`ResNetForImageClassification`) |

### Training Details

- **Optimizer:** AdamW (lr=1e-3, weight_decay=0.05)
- **Scheduler:** Cosine annealing with 1-epoch linear warmup
- **Loss:** CrossEntropy with label smoothing (0.1)
- **Augmentations:** RandomCrop, ColorJitter, RandomGrayscale, RandomErasing
- **⚠️ No flips:** horizontal/vertical flips would corrupt rotation labels
- **Mixed precision:** FP16 via `torch.cuda.amp`

---

## 🚀 Quick Start

### Installation

```bash
pip install transformers torch torchvision pillow requests
```

### Inference: Single Image from URL

```bash
python3 use_with_UI.py
```

Note: download `use_with_UI.py` first 😄

## 💡 Example

Input (rotated 90° CCW):

![cat image, rotated to the left by 90°](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/My19YzxFOYQkftVHw1rPr.png)

GyroScope Output:

**📐 Recognized: 90° | Correction: 270°**

**📊 Probs: {'0°': '0.0257', '90°': '0.8706', '180°': '0.0735', '270°': '0.0300'}**

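
To make the link between the probabilities, the recognized angle, and the correction explicit, here is a minimal sketch. `decode_prediction` is a hypothetical helper for illustration and is not taken from `use_with_UI.py`; it assumes the probabilities are available as floats keyed by angle label.

```python
def decode_prediction(probs):
    """Pick the most likely angle and derive the CCW correction angle."""
    detected = max(probs, key=probs.get)   # label with the highest probability
    angle = int(detected.rstrip("°"))      # "90°" -> 90
    return angle, (360 - angle) % 360


# Values copied from the example output above, converted to floats.
probs = {"0°": 0.0257, "90°": 0.8706, "180°": 0.0735, "270°": 0.0300}

angle, correction = decode_prediction(probs)
print(f"Recognized: {angle}° | Correction: {correction}°")
# Recognized: 90° | Correction: 270°
```
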

Corrected:

![cat image, now correctly rotated](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/QhpbxPuwhyMj19N7v34iZ.png)

*Original Image Source: [Link to Pexels](https://www.pexels.com/de-de/foto/ruhiger-schlaf-einer-getigerten-hauskatze-auf-einem-sofa-32441547/)*

## ⚠️ Limitations

- Rotationally symmetric images (balls, textures, patterns) are inherently ambiguous; no model can reliably classify these.
- Trained on natural images (ImageNet). Performance may degrade on:
  - Documents / text-heavy images
  - Medical imaging
  - Satellite / aerial imagery
  - Abstract art
- Only handles 90° increments; arbitrary angles (e.g. 45° or 135°) are **not supported**!
- Trained from scratch on 50k images; fine-tuning a pretrained backbone would likely yield higher accuracy.

## 📝 Use Cases

- 📸 Photo management: auto-correct phone/camera orientation
- 🗂️ Data preprocessing: fix rotated images in scraped datasets
- 🤖 ML pipelines: orientation normalization before feeding to downstream models
- 🖼️ Digital archives: batch-correct scanned/uploaded images

> Yesterday, I was sorting photos and it felt like every photo was rotated wrong! This inspired me to make this tool 😂

## 💻 Training code

The full training code can be found in `train.py`. Have fun 😊

## 📜 License

Apache 2.0

## 🙏 Acknowledgments

- Dataset: ILSVRC/ImageNet-1k
- Architecture: Microsoft ResNet via 🤗 Transformers
- Trained on Kaggle (Tesla T4 GPU)

---

> GyroScope: because every image deserves to stand upright.