	---
	license: apache-2.0
	tags:
	- image-classification
	- rotation-prediction
	- resnet
	- pytorch
	- vision
	datasets:
	- ILSVRC/imagenet-1k
	pipeline_tag: image-classification
	library_name: transformers
	---

	# 🔄 GyroScope — Image Rotation Prediction

	GyroScope is a ResNet-18 trained from scratch to detect whether an image is rotated by 0°, 90°, 180°, or 270° — and correct it automatically.

	> Is that photo upside down? Let GyroScope figure it out.

	---

	## 🎯 Task

	Given any image, GyroScope classifies its orientation into one of 4 classes:

	\| Label \| Meaning \| Correction \|
	\|-------\|---------\|------------\|
	\| 0 \| 0° — upright ✅ \| None \|
	\| 1 \| 90° CCW \| Rotate 270° CCW \|
	\| 2 \| 180° — upside down \| Rotate 180° \|
	\| 3 \| 270° CCW (= 90° CW) \| Rotate 90° CCW \|

	Correction formula: `correction = (360 − detected_angle) % 360`
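
As a minimal sketch of the formula above: the helper names `correction_angle` and `make_upright` are illustrative and not part of this repository, and `make_upright` assumes a Pillow image, whose `rotate()` method turns the image counter-clockwise by the given number of degrees.

```python
def correction_angle(label: int) -> int:
    """Map a predicted class label (0-3) to the CCW rotation that undoes it."""
    if label not in (0, 1, 2, 3):
        raise ValueError(f"label must be 0, 1, 2 or 3, got {label!r}")
    detected_angle = 90 * label            # label k means "rotated k * 90° CCW"
    return (360 - detected_angle) % 360


def make_upright(image, label: int):
    """Rotate a Pillow image back to upright (assumes PIL.Image.Image)."""
    # expand=True lets the canvas swap width and height for 90°/270° turns.
    return image.rotate(correction_angle(label), expand=True)
```

For the four labels in the table, `correction_angle` yields 0, 270, 180 and 90 degrees respectively.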

	---

	## 📊 Benchmarks

	Trained on 50,000 images from [ImageNet-1k](https://huggingface.co/datasets/ILSVRC/imagenet-1k) × 4 rotations = 200k training samples.
	Validated on 5,000 images × 4 rotations = 20k validation samples.
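
One way to picture the ×4 expansion is an index mapping in which every source image yields four samples, one per rotation. The class below is a hypothetical sketch, not the code in `train.py`; it assumes Pillow-style images with a `rotate()` method that turns counter-clockwise.

```python
class RotationSamples:
    """Expose each source image four times, once per 90° CCW rotation step."""

    ANGLES = (0, 90, 180, 270)

    def __init__(self, images):
        self.images = images               # any sequence of Pillow-style images

    def __len__(self):
        return len(self.images) * len(self.ANGLES)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        image_index, label = divmod(index, len(self.ANGLES))
        rotated = self.images[image_index].rotate(self.ANGLES[label], expand=True)
        return rotated, label              # label k corresponds to k * 90° CCW
```

With 50,000 source images, `len()` of such an object is 200,000, which matches the training sample count stated above.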

	\| Metric \| Value \|
	\|--------\|-------\|
	\| Overall Val Accuracy \| 79.81% \|
	\| Per-class: 0° (upright) \| 79.8% \|
	\| Per-class: 90° CCW \| 80.1% \|
	\| Per-class: 180° \| 79.4% \|
	\| Per-class: 270° CCW \| 79.8% \|
	\| Training Epochs \| 12 \|
	\| Training Time \| ~4h (Kaggle T4 GPU) \|
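
Because every validation image appears once per rotation, the four classes are equally sized, so the overall accuracy should equal the plain mean of the per-class values. A quick check using the rounded figures from the table:

```python
from statistics import mean

per_class_accuracy = {"0°": 79.8, "90°": 80.1, "180°": 79.4, "270°": 79.8}

# With balanced classes, overall accuracy is the unweighted mean of the per-class values.
overall = mean(per_class_accuracy.values())
print(f"{overall:.1f}%")  # agrees with the reported 79.81% up to rounding
```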

	### Training Curve

	\| Epoch \| Train Acc \| Val Acc \|
	\|-------\|----------\|---------\|
	\| 1 \| 41.4% \| 43.2% \|
	\| 2 \| 52.0% \| 46.9% \|
	\| 3 \| 59.4% \| 62.8% \|
	\| 4 \| 64.1% \| 66.0% \|
	\| 5 \| 67.8% \| 69.48% \|
	\| 6 \| 70.6% \| 72.22% \|
	\| 7 \| 73.3% \| 74.25% \|
	\| 8 \| 75.6% \| 76.49% \|
	\| 9 \| 77.5% \| 77.47% \|
	\| 10 \| 79.1% \| 79.47% \|
	\| 11 \| 80.3% \| 79.78% \|
	\| 12 \| 80.9% \| 79.81% \|

	---

	## 🏗️ Architecture

	\| Detail \| Value \|
	\|--------\|-------\|
	\| Base \| ResNet-18 (from scratch, no pretrained weights) \|
	\| Parameters \| 11.2M \|
	\| Input \| 224 × 224 RGB \|
	\| Output \| 4 classes (0°, 90°, 180°, 270°) \|
	\| Framework \| 🤗 Hugging Face Transformers (`ResNetForImageClassification`) \|
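
The 11.2M figure can be reproduced by hand. The sketch below assumes the standard ResNet-18 layout (a 7×7 stem, four stages of two basic blocks with widths 64, 128, 256 and 512, bias-free convolutions, and a 1×1 projection shortcut wherever the width changes). It counts only trainable weights, so batch-norm running statistics are excluded. The function name is illustrative.

```python
def resnet18_param_count(num_classes: int = 4, in_channels: int = 3) -> int:
    """Count trainable parameters of a standard ResNet-18 with a linear head."""

    def conv(c_in, c_out, k):
        return c_in * c_out * k * k        # convolutions carry no bias

    def bn(c):
        return 2 * c                       # scale and shift per channel

    total = conv(in_channels, 64, 7) + bn(64)            # stem
    c_in = 64
    for width in (64, 128, 256, 512):
        for _ in range(2):                               # two basic blocks per stage
            total += conv(c_in, width, 3) + bn(width)    # first 3x3 conv
            total += conv(width, width, 3) + bn(width)   # second 3x3 conv
            if c_in != width:                            # projection shortcut
                total += conv(c_in, width, 1) + bn(width)
            c_in = width
    total += 512 * num_classes + num_classes             # linear classifier
    return total


print(f"{resnet18_param_count() / 1e6:.1f}M")  # prints 11.2M
```

Almost all of the weights sit in the backbone: swapping a 1000-class head for the 4-class head removes only about half a million parameters.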

	### Training Details

	- Optimizer: AdamW (lr=1e-3, weight_decay=0.05)
	- Scheduler: Cosine annealing with 1-epoch linear warmup
	- Loss: CrossEntropy with label smoothing (0.1)
	- Augmentations: RandomCrop, ColorJitter, RandomGrayscale, RandomErasing
	- ⚠️ No flips — horizontal/vertical flips would corrupt rotation labels
	- Mixed precision: FP16 via `torch.cuda.amp`
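
Why flips are unsafe can be checked on a toy grid. Mirroring a rotated image is equivalent to rotating the mirrored image the opposite way, so a horizontal flip swaps the 90° and 270° classes and a vertical flip swaps 0° and 180°. The helpers below are an illustrative pure-Python sketch on nested lists, not code from `train.py`.

```python
def rot90_ccw(grid, times=1):
    """Rotate a 2-D list counter-clockwise by 90 degrees, `times` times."""
    for _ in range(times % 4):
        grid = [list(row) for row in zip(*grid)][::-1]
    return grid


def hflip(grid):
    return [row[::-1] for row in grid]


def vflip(grid):
    return grid[::-1]


def label_after_hflip(label):
    return (-label) % 4        # 0 -> 0, 1 -> 3, 2 -> 2, 3 -> 1


def label_after_vflip(label):
    return (2 - label) % 4     # 0 -> 2, 1 -> 1, 2 -> 0, 3 -> 3


upright = [[1, 2, 3],
           [4, 5, 6]]
mirrored = hflip(upright)      # a mirrored photo still looks upright

for label in range(4):
    sample = rot90_ccw(upright, label)
    # The flipped sample is the mirrored image at a different rotation label.
    assert hflip(sample) == rot90_ccw(mirrored, label_after_hflip(label))
    assert vflip(sample) == rot90_ccw(mirrored, label_after_vflip(label))
```

A flip augmentation would therefore need to remap the label alongside the pixels; leaving flips out avoids that bookkeeping entirely.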

	---

	## 🚀 Quick Start

	### Installation

	```bash
	pip install transformers torch torchvision pillow requests
	```

	### Inference — Single Image from URL
	```bash
	python3 use_with_UI.py
	```

	Download `use_with_UI.py` from this repository before running the command above 😄
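
For use without the UI script, inference might look roughly like the sketch below. It is not taken from `use_with_UI.py`: `MODEL_ID` is a placeholder to replace with this repository's id, the function names are illustrative, and the code assumes the repository ships a preprocessor config that `AutoImageProcessor` can load. The `transformers` import is deferred to `load_model` so that the helper stays importable on its own.

```python
MODEL_ID = "<namespace>/<repo-name>"  # placeholder: replace with this model's repo id


def load_model(model_id: str = MODEL_ID):
    """Load processor and model from the Hub (assumes both configs exist)."""
    from transformers import AutoImageProcessor, ResNetForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = ResNetForImageClassification.from_pretrained(model_id)
    return processor, model.eval()


def predict_and_correct(image, processor, model):
    """Return (label, upright_image) for a Pillow image."""
    inputs = processor(images=image.convert("RGB"), return_tensors="pt")
    logits = model(**inputs).logits              # shape (1, 4)
    label = int(logits.argmax(-1))               # 0..3, i.e. label * 90° CCW
    correction = (360 - 90 * label) % 360
    # Pillow's rotate() turns counter-clockwise; expand=True keeps the full frame.
    return label, image.rotate(correction, expand=True)
```

Typical usage would be `processor, model = load_model()` followed by `predict_and_correct(Image.open("photo.jpg"), processor, model)`, ideally inside a `torch.no_grad()` block since no gradients are needed.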

	## 💡 Example

	Input (rotated 90° CCW):

	![cat image, rotated to the left by 90°](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/My19YzxFOYQkftVHw1rPr.png)

	GyroScope Output:
	📐 Recognized: 90° \| Correction: 270°
	📊 Probs: {'0°': '0.0257', '90°': '0.8706', '180°': '0.0735', '270°': '0.0300'}
	<br>
	Corrected:

	![cat image, now correctly rotated](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/QhpbxPuwhyMj19N7v34iZ.png)

	Original Image Source: [Link to Pexels](https://www.pexels.com/de-de/foto/ruhiger-schlaf-einer-getigerten-hauskatze-auf-einem-sofa-32441547/)
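
The `Recognized` and `Probs` lines in the example output can be derived from the model's four raw logits with a softmax. This is an illustrative stdlib-only sketch: the function name and the example logits are made up, and the actual script may format its output differently.

```python
import math

ANGLES = (0, 90, 180, 270)


def describe_prediction(logits):
    """Turn four raw logits into (detected_angle, correction, probabilities)."""
    if len(logits) != len(ANGLES):
        raise ValueError("expected one logit per rotation class")
    peak = max(logits)                                   # subtract max for stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = {f"{a}°": f"{e / total:.4f}" for a, e in zip(ANGLES, exps)}
    detected = ANGLES[max(range(len(ANGLES)), key=lambda i: logits[i])]
    correction = (360 - detected) % 360
    return detected, correction, probs


detected, correction, probs = describe_prediction([-1.2, 2.3, -0.2, -1.1])
print(f"📐 Recognized: {detected}° | Correction: {correction}°")
print(f"📊 Probs: {probs}")
```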

	## ⚠️ Limitations

	- Rotationally symmetric images (balls, textures, patterns) are inherently ambiguous — no model can reliably classify these.
	- Trained on natural images (ImageNet). Performance may degrade on:
	  - Documents / text-heavy images
	  - Medical imaging
	  - Satellite / aerial imagery
	  - Abstract art
	- Only handles 90° increments: arbitrary angles (e.g. 45° or 135°) are not supported.
	- Trained from scratch on 50k images; fine-tuning a pretrained backbone would likely yield higher accuracy.


	## 📝 Use Cases

	- 📸 Photo management — auto-correct phone/camera orientation
	- 🗂️ Data preprocessing — fix rotated images in scraped datasets
	- 🤖 ML pipelines — orientation normalization before feeding to downstream models
	- 🖼️ Digital archives — batch-correct scanned/uploaded images

	> Yesterday, I was sorting photos and like every photo was rotated wrong! This inspired me to make this tool 😂

	## 💻 Training code

	The full training code can be found in `train.py`. Have fun 😊

	## 📜 License

	Apache 2.0

	## 🙏 Acknowledgments

	- Dataset: ILSVRC/ImageNet-1k
	- Architecture: Microsoft ResNet via 🤗 Transformers
	- Trained on Kaggle (Tesla T4 GPU)

	---

	> GyroScope — because every image deserves to stand upright.