---
license: apache-2.0
tags:
  - image-classification
  - rotation-prediction
  - resnet
  - pytorch
  - vision
datasets:
  - ILSVRC/imagenet-1k
pipeline_tag: image-classification
library_name: transformers
---

# 🔄 GyroScope: Image Rotation Prediction

GyroScope is a ResNet-18 trained from scratch to detect whether an image is rotated by 0°, 90°, 180°, or 270°, and to correct it automatically.

Is that photo upside down? Let GyroScope figure it out.


## 🎯 Task

Given any image, GyroScope classifies its orientation into one of four classes:

| Label | Meaning | Correction |
|-------|---------|------------|
| 0 | 0° (upright) | ✅ None |
| 1 | 90° CCW | Rotate 270° CCW |
| 2 | 180° (upside down) | Rotate 180° |
| 3 | 270° CCW (= 90° CW) | Rotate 90° CCW |

Correction formula: `correction = (360 - detected_angle) % 360`
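In code, the mapping from predicted class to correction angle might look like this (a minimal sketch; the class-to-angle table and the formula are the ones above, the function name is illustrative):

```python
# Class labels 0-3 correspond to CCW rotation angles 0/90/180/270.
ANGLES = {0: 0, 1: 90, 2: 180, 3: 270}

def correction_angle(label: int) -> int:
    """CCW angle that undoes the detected rotation."""
    return (360 - ANGLES[label]) % 360
```

For example, class 1 (rotated 90° CCW) needs a further 270° CCW turn to come back upright.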


## 📊 Benchmarks

Trained on 50,000 images from ImageNet-1k × 4 rotations = 200k training samples.
Validated on 5,000 images × 4 rotations = 20k validation samples.
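The four-way expansion of each image can be sketched with Pillow (a sketch; the function name is illustrative, and `Image.rotate` with `expand=True` performs the CCW rotations):

```python
from PIL import Image

ANGLES = [0, 90, 180, 270]  # CCW angles, matching class labels 0-3

def make_rotation_samples(img):
    """Expand one image into four (rotated_image, label) training pairs."""
    return [(img.rotate(angle, expand=True), label)
            for label, angle in enumerate(ANGLES)]
```

Each source image contributes one sample per class, which is how 50k images become 200k training samples.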

| Metric | Value |
|--------|-------|
| Overall val accuracy | 79.81% |
| Per-class: 0° (upright) | 79.8% |
| Per-class: 90° CCW | 80.1% |
| Per-class: 180° | 79.4% |
| Per-class: 270° CCW | 79.8% |
| Training epochs | 12 |
| Training time | ~4 h (Kaggle T4 GPU) |

### Training Curve

| Epoch | Train Acc | Val Acc |
|-------|-----------|---------|
| 1 | 41.4% | 43.2% |
| 2 | 52.0% | 46.9% |
| 3 | 59.4% | 62.8% |
| 4 | 64.1% | 66.0% |
| 5 | 67.8% | 69.48% |
| 6 | 70.6% | 72.22% |
| 7 | 73.3% | 74.25% |
| 8 | 75.6% | 76.49% |
| 9 | 77.5% | 77.47% |
| 10 | 79.1% | 79.47% |
| 11 | 80.3% | 79.78% |
| 12 | 80.9% | 79.81% |

πŸ—οΈ Architecture

| Detail | Value |
|--------|-------|
| Base | ResNet-18 (from scratch, no pretrained weights) |
| Parameters | 11.2M |
| Input | 224 × 224 RGB |
| Output | 4 classes (0°, 90°, 180°, 270°) |
| Framework | 🤗 Hugging Face Transformers (`ResNetForImageClassification`) |

### Training Details

- Optimizer: AdamW (lr=1e-3, weight_decay=0.05)
- Scheduler: cosine annealing with 1-epoch linear warmup
- Loss: cross-entropy with label smoothing (0.1)
- Augmentations: RandomCrop, ColorJitter, RandomGrayscale, RandomErasing
- ⚠️ No flips: horizontal/vertical flips would corrupt rotation labels
- Mixed precision: FP16 via `torch.cuda.amp`

## 🚀 Quick Start

### Installation

```bash
pip install transformers torch torchvision pillow requests
```

### Inference: Single Image from URL

```bash
python3 use_with_UI.py
```

Download `use_with_UI.py` from this repository first 😄
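If you prefer calling the model directly instead of through the script, the inference code has roughly this shape (a sketch: it builds a randomly initialized ResNet-18-style `ResNetForImageClassification` offline, since the hub repo id is not stated here; in practice you would load the trained weights with `from_pretrained`):

```python
import torch
from transformers import ResNetConfig, ResNetForImageClassification

# ResNet-18-style config: basic blocks, depths [2, 2, 2, 2]. These values
# are assumptions chosen to match a standard ResNet-18.
config = ResNetConfig(
    layer_type="basic",
    depths=[2, 2, 2, 2],
    hidden_sizes=[64, 128, 256, 512],
    embedding_size=64,
    num_labels=4,
)
model = ResNetForImageClassification(config)  # random weights for this sketch
# Real usage: ResNetForImageClassification.from_pretrained("<this-repo-id>")

model.eval()
with torch.no_grad():
    pixel_values = torch.randn(1, 3, 224, 224)  # stand-in for a processed image
    logits = model(pixel_values=pixel_values).logits

probs = logits.softmax(dim=-1)
label = probs.argmax(dim=-1).item()             # 0-3 -> 0/90/180/270 degrees CCW
correction = (360 - label * 90) % 360
```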

## 💡 Example

Input (rotated 90° CCW):

*(cat image, rotated 90° to the left)*

GyroScope output:

```
📍 Recognized: 90° | Correction: 270°
📊 Probs: {'0°': '0.0257', '90°': '0.8706', '180°': '0.0735', '270°': '0.0300'}
```

Corrected:

*(cat image, now upright)*

Original image source: Pexels
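Turning that output into an upright image takes one `rotate` call with Pillow (a sketch; the probabilities are copied from the example output above, and the blank image stands in for the actual photo):

```python
from PIL import Image

probs = {"0°": 0.0257, "90°": 0.8706, "180°": 0.0735, "270°": 0.0300}
detected = max(probs, key=probs.get)         # "90°"
angle = int(detected.rstrip("°"))
correction = (360 - angle) % 360             # 270

img = Image.new("RGB", (6, 4))               # stand-in for the rotated photo
fixed = img.rotate(correction, expand=True)  # Image.rotate turns CCW
```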

## ⚠️ Limitations

- Rotationally symmetric images (balls, textures, patterns) are inherently ambiguous; no model can reliably classify these.
- Trained on natural images (ImageNet). Performance may degrade on:
  - documents / text-heavy images
  - medical imaging
  - satellite / aerial imagery
  - abstract art
- Only handles 90° increments; arbitrary angles (e.g. 45° or 135°) are not supported.
- Trained from scratch on 50k images; fine-tuning a pretrained backbone would likely yield higher accuracy.

πŸ“ Use Cases

  • πŸ“Έ Photo management β€” auto-correct phone/camera orientation
  • πŸ—‚οΈ Data preprocessing β€” fix rotated images in scraped datasets
  • πŸ€– ML pipelines β€” orientation normalization before feeding to downstream models
  • πŸ–ΌοΈ Digital archives β€” batch-correct scanned/uploaded images

Yesterday, I was sorting photos and like every photo was rotated wrong! This inspired me to make this tool πŸ˜‚

## 💻 Training Code

The full training code can be found in `train.py`. Have fun 😊

## 📜 License

Apache 2.0

πŸ™ Acknowledgments

  • Dataset: ILSVRC/ImageNet-1k
  • Architecture: Microsoft ResNet via πŸ€— Transformers
  • Trained on Kaggle (Tesla T4 GPU)

GyroScope β€” because every image deserves to stand upright.