---
license: apache-2.0
tags:
- image-classification
- rotation-prediction
- resnet
- pytorch
- vision
datasets:
- ILSVRC/imagenet-1k
pipeline_tag: image-classification
library_name: transformers
---
# 🔄 GyroScope: Image Rotation Prediction
**GyroScope** is a ResNet-18 trained **from scratch** to detect whether an image is rotated by **0°, 90°, 180°, or 270°**, and correct it automatically.
> Is that photo upside down? Let GyroScope figure it out.
---
## 🎯 Task
Given any image, GyroScope classifies its orientation into one of **4 classes**:
| Label | Meaning | Correction |
|-------|---------|------------|
| 0 | 0Β° β€” upright βœ… | None |
| 1 | 90Β° CCW | Rotate 270Β° CCW |
| 2 | 180Β° β€” upside down | Rotate 180Β° |
| 3 | 270Β° CCW (= 90Β° CW) | Rotate 90Β° CCW |
**Correction formula:** `correction = (360 βˆ’ detected_angle) % 360`
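As a minimal sketch of the table and formula above, the mapping from class label to correction angle can be written as follows. The function names `label_to_angle` and `correction_angle` are illustrative and not taken from the repository's scripts.

```python
def label_to_angle(label: int) -> int:
    """Map a class label (0-3) to the detected counter-clockwise rotation in degrees."""
    if label not in (0, 1, 2, 3):
        raise ValueError(f"label must be 0, 1, 2 or 3, got {label}")
    return label * 90


def correction_angle(detected_angle: int) -> int:
    """Counter-clockwise rotation (in degrees) that brings the image back upright."""
    return (360 - detected_angle) % 360


for label in range(4):
    detected = label_to_angle(label)
    print(f"label {label}: detected {detected}° CCW, rotate {correction_angle(detected)}° CCW")
```

With Pillow, `image.rotate(correction, expand=True)` would apply the result, since `Image.rotate` turns images counter-clockwise.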
---
## 📊 Benchmarks
Trained on **50,000 images** from [ImageNet-1k](https://huggingface.co/datasets/ILSVRC/imagenet-1k) × 4 rotations = **200k training samples**.
Validated on **5,000 images** × 4 rotations = **20k validation samples**.
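The "× 4 rotations" step can be pictured with a short NumPy sketch. It is an illustration under assumptions, not the code from `train.py`: the helper `make_rotation_samples` and the tiny stand-in array are hypothetical, and the real pipeline may rotate tensors or PIL images instead.

```python
import numpy as np


def make_rotation_samples(image: np.ndarray) -> list[tuple[np.ndarray, int]]:
    """Return (rotated_image, label) pairs for labels 0..3.

    `image` is an upright H x W x C array; label k means the copy was
    rotated by k * 90 degrees counter-clockwise.
    """
    return [(np.rot90(image, k=k, axes=(0, 1)), k) for k in range(4)]


# Hypothetical stand-in for one upright training image (H=2, W=3, RGB).
upright = np.arange(2 * 3 * 3).reshape(2, 3, 3)
samples = make_rotation_samples(upright)

for rotated, label in samples:
    # Undo the rotation with the correction rule from the Task section.
    correction_steps = (4 - label) % 4
    restored = np.rot90(rotated, k=correction_steps, axes=(0, 1))
    assert np.array_equal(restored, upright)
    print(label, rotated.shape)
```

Each source image therefore contributes one sample per class, which keeps the four classes perfectly balanced.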
| Metric | Value |
|--------|-------|
| **Overall Val Accuracy** | **79.81%** |
| Per-class: 0° (upright) | 79.8% |
| Per-class: 90° CCW | 80.1% |
| Per-class: 180° | 79.4% |
| Per-class: 270° CCW | 79.8% |
| Training Epochs | 12 |
| Training Time | ~4h (Kaggle T4 GPU) |
### Training Curve
| Epoch | Train Acc | Val Acc |
|-------|----------|---------|
| 1 | 41.4% | 43.2% |
| 2 | 52.0% | 46.9% |
| 3 | 59.4% | 62.8% |
| 4 | 64.1% | 66.0% |
| 5 | 67.8% | 69.48% |
| 6 | 70.6% | 72.22% |
| 7 | 73.3% | 74.25% |
| 8 | 75.6% | 76.49% |
| 9 | 77.5% | 77.47% |
| 10 | 79.1% | 79.47% |
| 11 | 80.3% | 79.78% |
| 12 | 80.9% | 79.81% |
---
## 🏗️ Architecture
| Detail | Value |
|--------|-------|
| Base | ResNet-18 (from scratch, **no pretrained weights**) |
| Parameters | 11.2M |
| Input | 224 × 224 RGB |
| Output | 4 classes (0°, 90°, 180°, 270°) |
| Framework | 🤗 Hugging Face Transformers (`ResNetForImageClassification`) |
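The exact configuration lives in `train.py`, which is not reproduced here. As a sketch, assuming the standard ResNet-18 layout (basic blocks, depths 2-2-2-2, widths 64 to 512), a randomly initialized model with four output classes could be built like this. The `id2label` names are illustrative.

```python
import torch
from transformers import ResNetConfig, ResNetForImageClassification

# Assumed ResNet-18 layout; the values actually used are defined in train.py.
id2label = {0: "0", 1: "90", 2: "180", 3: "270"}
config = ResNetConfig(
    num_channels=3,
    embedding_size=64,
    hidden_sizes=[64, 128, 256, 512],
    depths=[2, 2, 2, 2],
    layer_type="basic",
    id2label=id2label,
    label2id={name: idx for idx, name in id2label.items()},
)

# Building from a config (not from_pretrained) gives random weights,
# i.e. training starts from scratch.
model = ResNetForImageClassification(config)
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.1f}M parameters")

# Dummy forward pass: one 224 x 224 RGB image in, four logits out.
model.eval()
with torch.no_grad():
    logits = model(pixel_values=torch.randn(1, 3, 224, 224)).logits
print(logits.shape)
```

If the assumed layout matches the one used for training, the printed parameter count should land close to the 11.2M listed in the table.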
### Training Details
- **Optimizer:** AdamW (lr=1e-3, weight_decay=0.05)
- **Scheduler:** Cosine annealing with 1-epoch linear warmup
- **Loss:** CrossEntropy with label smoothing (0.1)
- **Augmentations:** RandomCrop, ColorJitter, RandomGrayscale, RandomErasing
- **⚠️ No flips:** horizontal/vertical flips would corrupt rotation labels
- **Mixed precision:** FP16 via `torch.cuda.amp`
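To see why flips are excluded from the augmentations, consider what a horizontal flip does to a rotated sample. The NumPy sketch below uses a toy array in place of an image. It shows that mirroring a sample labelled 90° CCW gives exactly the 270° CCW rotation of the mirrored original, so the stored label would no longer match the content.

```python
import numpy as np

# Toy 2 x 3 single-channel "image"; any upright image behaves the same way.
upright = np.arange(6).reshape(2, 3)

# Training sample with label 1: rotated 90 degrees counter-clockwise.
rotated_90 = np.rot90(upright, k=1)

# Apply a horizontal flip as if it were an ordinary augmentation.
flipped = np.fliplr(rotated_90)

# The result equals the mirrored original rotated by 270 degrees CCW (label 3),
# not by 90 degrees, so the label 1 attached to it would be wrong.
mirrored = np.fliplr(upright)
looks_like_270 = np.array_equal(flipped, np.rot90(mirrored, k=3))
looks_like_90 = np.array_equal(flipped, np.rot90(mirrored, k=1))
print(looks_like_270, looks_like_90)
```

The same swap between the 90° and 270° classes happens with a vertical flip (`np.flipud`).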
---
## 🚀 Quick Start
### Installation
```bash
pip install transformers torch torchvision pillow requests
```
### Inference: Single Image from URL
Download `use_with_UI.py` from this repository first 😄, then run it:
```bash
python3 use_with_UI.py
```
## 💡 Example
Input (rotated 90° CCW):
![cat image, rotated to the left by 90°](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/My19YzxFOYQkftVHw1rPr.png)
GyroScope Output:
**📐 Recognized: 90° | Correction: 270°**
**📊 Probs: {'0°': '0.0257', '90°': '0.8706', '180°': '0.0735', '270°': '0.0300'}**
<br>
Corrected:
![cat image, now correctly rotated](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/QhpbxPuwhyMj19N7v34iZ.png)
*Original Image Source: [Link to Pexels](https://www.pexels.com/de-de/foto/ruhiger-schlaf-einer-getigerten-hauskatze-auf-einem-sofa-32441547/)*
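The two output lines in this example follow directly from the class probabilities. A small sketch of that post-processing is shown below. The helper name `interpret` is hypothetical, and `use_with_UI.py` may structure this differently.

```python
# Class probabilities copied from the example output above.
probs = {"0°": 0.0257, "90°": 0.8706, "180°": 0.0735, "270°": 0.0300}


def interpret(probs: dict[str, float]) -> tuple[int, int]:
    """Return (recognized_angle, correction_angle), both in degrees CCW."""
    best = max(probs, key=probs.get)  # name of the most probable class
    recognized = int(best.rstrip("°"))
    correction = (360 - recognized) % 360
    return recognized, correction


recognized, correction = interpret(probs)
print(f"Recognized: {recognized}° | Correction: {correction}°")
# prints "Recognized: 90° | Correction: 270°"
```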
## ⚠️ Limitations
- Rotationally symmetric images (balls, textures, patterns) are inherently ambiguous; no model can reliably classify them.
- Trained on natural images (ImageNet). Performance may degrade on:
  - Documents / text-heavy images
  - Medical imaging
  - Satellite / aerial imagery
  - Abstract art
- Only handles 90° increments; arbitrary angles (e.g. 45° or 135°) are **not supported**.
- Trained from scratch on 50k images; fine-tuning a pretrained backbone would likely yield higher accuracy.
## 📝 Use Cases
- 📸 Photo management: auto-correct phone/camera orientation
- 🗂️ Data preprocessing: fix rotated images in scraped datasets
- 🤖 ML pipelines: orientation normalization before feeding to downstream models
- 🖼️ Digital archives: batch-correct scanned/uploaded images
> Yesterday, I was sorting photos and practically every photo was rotated wrong! This inspired me to make this tool 😂
## 💻 Training code
The full training code can be found in `train.py`. Have fun 😊
## 📜 License
Apache 2.0
## 🙏 Acknowledgments
- Dataset: ILSVRC/ImageNet-1k
- Architecture: Microsoft ResNet via 🤗 Transformers
- Trained on Kaggle (Tesla T4 GPU)
---
> GyroScope: because every image deserves to stand upright.