LH-Tech-AI committed
Commit 5e2f346 · verified · 1 Parent(s): 5687750

Create README.md

  1. README.md +158 -0
README.md ADDED
@@ -0,0 +1,158 @@
+ ---
+ license: apache-2.0
+ tags:
+ - image-classification
+ - rotation-prediction
+ - resnet
+ - pytorch
+ - vision
+ datasets:
+ - ILSVRC/imagenet-1k
+ pipeline_tag: image-classification
+ library_name: transformers
+ ---
+
+ # 🔄 GyroScope: Image Rotation Prediction
+
+ **GyroScope** is a ResNet-18 trained **from scratch** to detect whether an image is rotated by **0°, 90°, 180°, or 270°** and to correct it automatically.
+
+ > Is that photo upside down? Let GyroScope figure it out.
+
+ ---
+
+ ## 🎯 Task
+
+ Given any image, GyroScope classifies its orientation into one of **4 classes**:
+
+ | Label | Meaning | Correction |
+ |-------|---------|------------|
+ | 0 | 0° (upright ✅) | None |
+ | 1 | 90° CCW | Rotate 270° CCW |
+ | 2 | 180° (upside down) | Rotate 180° |
+ | 3 | 270° CCW (= 90° CW) | Rotate 90° CCW |
+
+ **Correction formula:** `correction = (360 - detected_angle) % 360`, with both angles measured counter-clockwise.
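
To make the table and formula concrete, here is a minimal sketch that applies them to a toy image stored as nested lists. The helper names (`rot90_ccw`, `rotate_ccw`, `correction_angle`) are illustrative assumptions and are not part of GyroScope itself.

```python
# Illustrative helpers only; a toy "image" is a list of pixel rows.
LABEL_TO_ANGLE = {0: 0, 1: 90, 2: 180, 3: 270}  # degrees, counter-clockwise


def rot90_ccw(grid):
    """Rotate a 2D grid by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def rotate_ccw(grid, angle):
    """Rotate a 2D grid counter-clockwise by a multiple of 90 degrees."""
    if angle % 90 != 0:
        raise ValueError("angle must be a multiple of 90")
    for _ in range((angle // 90) % 4):
        grid = rot90_ccw(grid)
    return grid


def correction_angle(label):
    """Counter-clockwise rotation that undoes the detected rotation."""
    return (360 - LABEL_TO_ANGLE[label]) % 360


upright = [[1, 2, 3], [4, 5, 6]]
for label, angle in LABEL_TO_ANGLE.items():
    rotated = rotate_ccw(upright, angle)                      # what the model would see
    restored = rotate_ccw(rotated, correction_angle(label))   # apply the correction
    assert restored == upright
```

With Pillow, `Image.rotate(angle, expand=True)` also rotates counter-clockwise, so the same correction angle could be passed to it directly.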
+
+ ---
+
+ ## 📊 Benchmarks
+
+ Trained on **50,000 images** from [ImageNet-1k](https://huggingface.co/datasets/ILSVRC/imagenet-1k) × 4 rotations = **200k training samples**.
+ Validated on **5,000 images** × 4 rotations = **20k validation samples**.
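
The training script is not shown here, so the following is only a sketch of how each source image could yield four labeled samples. It assumes the source images are upright (label 0) and uses toy nested lists in place of real images; `expand_with_rotations` is a hypothetical name.

```python
def rot90_ccw(grid):
    """Rotate a 2D grid by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def expand_with_rotations(images):
    """Yield (rotated_image, label) for each image and each of the 4 rotations.

    Assumption: every input image is upright, so label k means
    "rotated by k * 90 degrees counter-clockwise".
    """
    for image in images:
        rotated = image
        for label in range(4):
            yield rotated, label
            rotated = rot90_ccw(rotated)


toy_images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
samples = list(expand_with_rotations(toy_images))
print(len(samples))  # 2 images x 4 rotations = 8

# The same arithmetic gives the sample counts quoted above.
assert 50_000 * 4 == 200_000
assert 5_000 * 4 == 20_000
```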
+
+ | Metric | Value |
+ |--------|-------|
+ | **Overall Val Accuracy** | **XX.XX%** |
+ | Per-class: 0° (upright) | XX.X% |
+ | Per-class: 90° CCW | XX.X% |
+ | Per-class: 180° | XX.X% |
+ | Per-class: 270° CCW | XX.X% |
+ | Training Epochs | 12 |
+ | Training Time | ~4h (Kaggle T4 GPU) |
+
+ <!-- UPDATE the table above with your final results -->
+
+ ### Training Curve
+
+ | Epoch | Train Acc | Val Acc |
+ |-------|-----------|---------|
+ | 1 | 41.4% | 43.2% |
+ | 2 | 52.0% | 46.9% |
+ | 3 | 59.4% | 62.8% |
+ | 4 | 64.1% | 66.0% |
+ | 5 | 67.8% | 69.48% |
+ | 6 | - | - |
+ | 7 | - | - |
+ | 8 | - | - |
+ | 9 | - | - |
+ | 10 | - | - |
+ | 11 | - | - |
+ | 12 | - | - |
+
+ <!-- UPDATE with final epoch results -->
+
+ ---
+
+ ## 🏗️ Architecture
+
+ | Detail | Value |
+ |--------|-------|
+ | Base | ResNet-18 (from scratch, **no pretrained weights**) |
+ | Parameters | 11.2M |
+ | Input | 224 × 224 RGB |
+ | Output | 4 classes (0°, 90°, 180°, 270°) |
+ | Framework | 🤗 Hugging Face Transformers (`ResNetForImageClassification`) |
+
+ ### Training Details
+
+ - **Optimizer:** AdamW (lr=1e-3, weight_decay=0.05)
+ - **Scheduler:** Cosine annealing with 1-epoch linear warmup
+ - **Loss:** CrossEntropy with label smoothing (0.1)
+ - **Augmentations:** RandomCrop, ColorJitter, RandomGrayscale, RandomErasing
+ - **⚠️ No flips:** horizontal/vertical flips would corrupt rotation labels, because a flipped 90° CCW image looks like a mirrored image rotated 270° CCW
+ - **Mixed precision:** FP16 via `torch.cuda.amp`
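
The "no flips" rule can be checked on a toy example. The sketch below uses nested lists instead of real images, and its helper names are illustrative rather than taken from the training code. It shows that horizontally flipping a sample labeled 90° CCW produces exactly the mirrored image rotated 270° CCW, so the original label would no longer describe the flipped sample.

```python
def rot90_ccw(grid):
    """Rotate a 2D grid by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def rotate_quarters(grid, quarter_turns):
    """Rotate counter-clockwise by quarter_turns * 90 degrees."""
    for _ in range(quarter_turns % 4):
        grid = rot90_ccw(grid)
    return grid


def hflip(grid):
    """Mirror a 2D grid left-to-right."""
    return [row[::-1] for row in grid]


image = [[1, 2, 3], [4, 5, 6]]

sample = rotate_quarters(image, 1)   # training sample with label 1 (90 deg CCW)
flipped_sample = hflip(sample)       # what a horizontal-flip augmentation would produce
mirrored = hflip(image)              # the mirrored upright image

# The flipped sample is the mirrored image rotated 270 deg CCW (label 3), not 90 deg.
assert flipped_sample == rotate_quarters(mirrored, 3)
assert flipped_sample != rotate_quarters(mirrored, 1)
```

Labels 0 and 2 survive a flip unchanged in this sense, but labels 1 and 3 swap, which is enough to inject wrong targets into training.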
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ### Installation
+
+ ```bash
+ pip install transformers torch torchvision pillow requests
+ ```
+
+ ### Inference: Single Image from URL
+
+ ```bash
+ python3 use.py
+ ```
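
The contents of `use.py` are not reproduced in this README. As a hedged sketch of the post-processing such a script would need, the snippet below turns four raw class scores into a detected angle, a correction angle, and a confidence. It assumes the class index order matches the label table in the Task section; the logits are made up and `interpret` is a hypothetical helper.

```python
import math

ANGLES = (0, 90, 180, 270)  # assumed class order: index k means k * 90 degrees CCW


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits):
    """Map 4 class scores to a prediction and the rotation that undoes it."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    detected = ANGLES[label]
    return {
        "label": label,
        "detected_angle": detected,
        "correction_ccw": (360 - detected) % 360,
        "confidence": probs[label],
    }


result = interpret([0.1, 0.3, 2.5, -1.0])  # made-up logits for illustration
print(result["detected_angle"], result["correction_ccw"])  # 180 180
```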
+
+ ## 💡 Example
+
+ Input (rotated 180°):
+
+ ![cat image, rotated to the left by 90°](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/My19YzxFOYQkftVHw1rPr.png)
+
+ GyroScope Output:
+ <!--OUTPUT HERE-->
+ <br>
+ Corrected:
+
+ ![cat image, now correctly rotated](https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/QhpbxPuwhyMj19N7v34iZ.png)
+
+ ## ⚠️ Limitations
+
+ - Rotationally symmetric images (balls, textures, patterns) are inherently ambiguous; no model can reliably classify these.
+ - Trained on natural images (ImageNet). Performance may degrade on:
+   - Documents / text-heavy images
+   - Medical imaging
+   - Satellite / aerial imagery
+   - Abstract art
+ - Only handles 90° increments; arbitrary angles (e.g. 45° or 135°) are **not supported**.
+ - Trained from scratch on 50k images; fine-tuning a pretrained backbone would likely yield higher accuracy.
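
The ambiguity for rotationally symmetric inputs can be illustrated on toy grids. This is a sketch with illustrative helper names, not part of the model: it lists every rotation label under which a grid looks identical to its unrotated self. Whenever more than one label is returned, no classifier could tell those orientations apart.

```python
def rot90_ccw(grid):
    """Rotate a 2D grid by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def indistinguishable_labels(grid):
    """Return the rotation labels (0-3) that leave the grid unchanged."""
    labels = []
    rotated = grid
    for label in range(4):
        if rotated == grid:
            labels.append(label)
        rotated = rot90_ccw(rotated)
    return labels


asymmetric = [[1, 2], [3, 4]]
checkerboard = [[0, 1], [1, 0]]
plus_sign = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]

print(indistinguishable_labels(asymmetric))    # [0]
print(indistinguishable_labels(checkerboard))  # [0, 2]
print(indistinguishable_labels(plus_sign))     # [0, 1, 2, 3]
```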
+
+ ## Use Cases
+
+ - 📸 Photo management: auto-correct phone/camera orientation
+ - 🗂️ Data preprocessing: fix rotated images in scraped datasets
+ - 🤖 ML pipelines: orientation normalization before feeding to downstream models
+ - 🖼️ Digital archives: batch-correct scanned/uploaded images
+
+ > Yesterday, I was sorting photos and almost every photo was rotated wrong! This inspired me to make this tool 😂
+
+ ## 📜 License
+
+ Apache 2.0
+
+ ## 🙏 Acknowledgments
+
+ - Dataset: ILSVRC/ImageNet-1k
+ - Architecture: Microsoft ResNet via 🤗 Transformers
+ - Trained on Kaggle (Tesla T4 GPU)
+
+ ---
+
+ > GyroScope: because every image deserves to stand upright.