---
license: apache-2.0
tags:
  - image-classification
  - rotation-prediction
  - resnet
  - pytorch
  - vision
datasets:
  - ILSVRC/imagenet-1k
pipeline_tag: image-classification
library_name: transformers
---

# πŸ”„ GyroScope β€” Image Rotation Prediction

**GyroScope** is a ResNet-18 trained **from scratch** to detect whether an image is rotated by **0Β°, 90Β°, 180Β°, or 270Β°** β€” and correct it automatically.

> Is that photo upside down? Let GyroScope figure it out.

---

## 🎯 Task

Given any image, GyroScope classifies its orientation into one of **4 classes**:

| Label | Meaning | Correction |
|-------|---------|------------|
| 0 | 0Β° β€” upright βœ… | None |
| 1 | 90Β° CCW | Rotate 270Β° CCW |
| 2 | 180Β° β€” upside down | Rotate 180Β° |
| 3 | 270Β° CCW (= 90Β° CW) | Rotate 90Β° CCW |

**Correction formula:** `correction = (360 βˆ’ detected_angle) % 360`
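The table and formula above can be sketched in a few lines of Python using Pillow (note that `PIL.Image.rotate` rotates counter-clockwise, matching the correction convention):

```python
from PIL import Image

# class index -> detected CCW rotation, as in the label table above
ANGLES = {0: 0, 1: 90, 2: 180, 3: 270}

def correction(label: int) -> int:
    """CCW degrees needed to bring the image upright."""
    return (360 - ANGLES[label]) % 360

def apply_correction(img: Image.Image, label: int) -> Image.Image:
    # expand=True keeps the full frame when width/height swap
    return img.rotate(correction(label), expand=True)
```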

---

## πŸ“Š Benchmarks

Trained on **50,000 images** from [ImageNet-1k](https://huggingface.co/datasets/ILSVRC/imagenet-1k) Γ— 4 rotations = **200k training samples**.  
Validated on **5,000 images** Γ— 4 rotations = **20k validation samples**.

| Metric | Value |
|--------|-------|
| **Overall Val Accuracy** | **79.81%** |
| Per-class: 0Β° (upright) | 79.8% |
| Per-class: 90Β° CCW | 80.1% |
| Per-class: 180Β° | 79.4% |
| Per-class: 270Β° CCW | 79.8% |
| Training Epochs | 12 |
| Training Time | ~4h (Kaggle T4 GPU) |

### Training Curve

| Epoch | Train Acc | Val Acc |
|-------|----------|---------|
| 1 | 41.4% | 43.2% |
| 2 | 52.0% | 46.9% |
| 3 | 59.4% | 62.8% |
| 4 | 64.1% | 66.0% |
| 5 | 67.8% | 69.48% |
| 6 | 70.6% | 72.22% |
| 7 | 73.3% | 74.25% |
| 8 | 75.6% | 76.49% |
| 9 | 77.5% | 77.47% |
| 10 | 79.1% | 79.47% |
| 11 | 80.3% | 79.78% |
| 12 | 80.9% | 79.81% |

---

## πŸ—οΈ Architecture

| Detail | Value |
|--------|-------|
| Base | ResNet-18 (from scratch, **no pretrained weights**) |
| Parameters | 11.2M |
| Input | 224 Γ— 224 RGB |
| Output | 4 classes (0Β°, 90Β°, 180Β°, 270Β°) |
| Framework | πŸ€— Hugging Face Transformers (`ResNetForImageClassification`) |

### Training Details

- **Optimizer:** AdamW (lr=1e-3, weight_decay=0.05)
- **Scheduler:** Cosine annealing with 1-epoch linear warmup
- **Loss:** CrossEntropy with label smoothing (0.1)
- **Augmentations:** RandomCrop, ColorJitter, RandomGrayscale, RandomErasing
- **⚠️ No flips** β€” horizontal/vertical flips would corrupt rotation labels
- **Mixed precision:** FP16 via `torch.cuda.amp`

---

## πŸš€ Quick Start

### Installation

```bash
pip install transformers torch torchvision pillow requests
```

### Inference β€” Single Image from URL
```bash
python3 use_with_UI.py
```

> Download `use_with_UI.py` from this repository first 😄
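If you'd rather call the model directly, here is a minimal sketch using the 🤗 Transformers API (the repo id below is a placeholder, not the real one; substitute the actual model id):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, ResNetForImageClassification

REPO_ID = "your-username/GyroScope"  # placeholder: replace with the actual repo id
ANGLES = [0, 90, 180, 270]           # class index -> detected CCW rotation

def fix_orientation(image: Image.Image) -> Image.Image:
    processor = AutoImageProcessor.from_pretrained(REPO_ID)
    model = ResNetForImageClassification.from_pretrained(REPO_ID)
    with torch.no_grad():
        logits = model(**processor(images=image, return_tensors="pt")).logits
    detected = ANGLES[int(logits.argmax(-1))]
    # PIL rotates counter-clockwise, so this undoes the detected rotation
    return image.rotate((360 - detected) % 360, expand=True)
```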

## πŸ’‘ Example

Input (rotated 90° CCW):

![cat image, rotated to the left by 90Β°](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/My19YzxFOYQkftVHw1rPr.png)

GyroScope Output:
**πŸ“ Recognized: 90Β° | Correction: 270Β°**
**πŸ“Š Probs: {'0Β°': '0.0257', '90Β°': '0.8706', '180Β°': '0.0735', '270Β°': '0.0300'}**
<br>
Corrected:

![cat image, now correctly rotated](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/QhpbxPuwhyMj19N7v34iZ.png)

*Original Image Source: [Link to Pexels](https://www.pexels.com/de-de/foto/ruhiger-schlaf-einer-getigerten-hauskatze-auf-einem-sofa-32441547/)*

## ⚠️ Limitations

- Rotationally symmetric images (balls, textures, patterns) are inherently ambiguous; no model can reliably classify them.
- Trained on natural images (ImageNet), so performance may degrade on:
  - documents / text-heavy images
  - medical imaging
  - satellite / aerial imagery
  - abstract art
- Only 90° increments are handled; arbitrary angles (e.g. 45° or 135°) are **not supported**.
- Trained from scratch on 50k images; fine-tuning a pretrained backbone would likely yield higher accuracy.


## πŸ“ Use Cases

- πŸ“Έ Photo management β€” auto-correct phone/camera orientation
- πŸ—‚οΈ Data preprocessing β€” fix rotated images in scraped datasets
- πŸ€– ML pipelines β€” orientation normalization before feeding to downstream models
- πŸ–ΌοΈ Digital archives β€” batch-correct scanned/uploaded images

> Yesterday I was sorting photos and nearly every photo was rotated wrong! That inspired me to make this tool 😂

## πŸ’» Training code

The full training code can be found in `train.py`. Have fun 😊

## πŸ“œ License

Apache 2.0

## πŸ™ Acknowledgments

- Dataset: ILSVRC/ImageNet-1k
- Architecture: Microsoft ResNet via πŸ€— Transformers
- Trained on Kaggle (Tesla T4 GPU)

---

>  GyroScope β€” because every image deserves to stand upright.