toderian committed on
Commit f51d9ed · verified · 1 Parent(s): 6b8d044

Upload folder using huggingface_hub

Files changed (6)
  1. README.md +219 -212
  2. config.json +34 -24
  3. model.py +244 -0
  4. model.safetensors +3 -0
  5. preprocessor_config.json +14 -0
  6. pytorch_model.bin +3 -0
README.md CHANGED
@@ -1,270 +1,277 @@
- # Cervical Type Classification - Model Training
-
- ## Overview
-
- This project classifies cervical images into 3 transformation zone types:
- - **Type 1**: Fully visible squamocolumnar junction (SCJ)
- - **Type 2**: Partially visible SCJ
- - **Type 3**: SCJ not visible (inside cervical canal)
-
- ## Best Model Summary
-
- | Metric | Value |
- |--------|-------|
- | **Validation Accuracy** | **65.52%** |
- | **Macro F1** | **65.61%** |
- | Best Epoch | 34 |
- | Total Parameters | 1,327,235 |
-
  ---

- ## Best Model Configuration
-
- **Run Name:** `L32_64_128_256_Res_SE_lr5e-04_d0.3`
-
- ### Architecture

- | Component | Value |
- |-----------|-------|
- | Conv Layers | [32, 64, 128, 256] |
- | FC Layers | [256, 128] |
- | Kernel Size | 3x3 |
- | Pooling | MaxPool 2x2 |
- | Batch Normalization | Yes |
- | Activation | ReLU |
- | Residual Connections | **Yes** |
- | SE Attention | **Yes** |

- ### Training Settings

- | Parameter | Value |
- |-----------|-------|
- | Learning Rate | 5e-4 |
- | Weight Decay | 1e-4 |
- | Dropout | 0.3 |
- | Batch Size | 32 |
- | Focal Loss Gamma | 2.0 |
- | Label Smoothing | 0.1 |
- | Data Augmentation | Yes |

  ---

- ## Performance Metrics
-
- ### Per-Class Metrics

- | Class | Precision | Recall | F1-Score | Support |
- |-------|-----------|--------|----------|---------|
- | **Type 1** | 79.26% | 61.49% | 69.26% | 348 |
- | **Type 2** | 58.09% | 75.29% | 65.58% | 348 |
- | **Type 3** | 64.40% | 59.77% | 62.00% | 348 |
- | **Macro Avg** | 67.25% | 65.52% | **65.61%** | 1044 |

- ### Confusion Matrix

  ```
-                Predicted
-           Type 1  Type 2  Type 3
- Actual Type 1   214     84     50
-        Type 2    21    262     65
-        Type 3    35    105    208
  ```

- ### Interpretation
-
- | Finding | Implication |
- |---------|-------------|
- | Type 1 has highest precision (79%) | When model predicts Type 1, it's usually correct |
- | Type 2 has highest recall (75%) | Model catches most Type 2 cases |
- | Type 3 has lowest metrics | Hardest to classify - often confused with Type 2 |
- | Type 2 ↔ Type 3 confusion is common | 105 Type 3 misclassified as Type 2 |

  ---

- ## Grid Search Results
-
- A grid search of 32 configurations was performed on January 17, 2026.
-
- ### Search Space
-
- | Parameter | Values Tested |
- |-----------|---------------|
- | Conv Layers | [32,64,128,256], [64,128,256] |
- | Learning Rate | 5e-4, 1e-4 |
- | Dropout | 0.3, 0.4 |
- | Residual | Yes, No |
- | SE Attention | Yes, No |
-
- ### Top 10 Configurations
-
- | Rank | Configuration | Accuracy | Key Features |
- |------|--------------|----------|--------------|
- | 1 | L32_64_128_256_Res_SE_lr5e-04_d0.3 | **65.52%** | 4-layer, Res+SE |
- | 2 | L64_128_256_Res_SE_lr5e-04_d0.3 | 65.04% | 3-layer, Res+SE |
- | 3 | L32_64_128_256_Res_SE_lr1e-04_d0.3 | 64.94% | 4-layer, lower LR |
- | 4 | L64_128_256_Res_SE_lr1e-04_d0.3 | 64.37% | 3-layer, lower LR |
- | 5 | L32_64_128_256_Res_SE_lr5e-04_d0.4 | 64.18% | Higher dropout |
- | 6 | L32_64_128_256_Res_lr5e-04_d0.4 | 64.08% | No SE |
- | 7 | L32_64_128_256_Res_lr1e-04_d0.3 | 63.60% | No SE, lower LR |
- | 8 | L32_64_128_256_Res_SE_lr1e-04_d0.4 | 63.51% | Lower LR, higher dropout |
- | 9 | L64_128_256_Res_SE_lr5e-04_d0.4 | 63.22% | 3-layer, higher dropout |
- | 10 | L64_128_256_Res_SE_lr1e-04_d0.4 | 63.12% | 3-layer, lower LR |
-
- ### Key Findings
-
- | Finding | Evidence |
- |---------|----------|
- | **Residual + SE is critical** | Top 10 models all use residual connections; top 4 use both Res+SE |
- | **4-layer network is better** | [32,64,128,256] outperforms [64,128,256] |
- | **Higher LR (5e-4) preferred** | 5e-4 consistently beats 1e-4 |
- | **Lower dropout (0.3) preferred** | 0.3 dropout outperforms 0.4 |
- | **Plain CNN performs worst** | Models without Res or SE are at the bottom |
-
- ### What Worked vs What Didn't
-
- | Worked | Didn't Work |
- |--------|-------------|
- | Residual connections | Plain convolutions |
- | SE attention blocks | No attention |
- | 4 conv layers | 3 conv layers |
- | LR = 5e-4 | LR = 1e-4 (too slow) |
- | Dropout = 0.3 | Dropout = 0.4 (too aggressive) |
- | Focal Loss | - |
- | Label smoothing 0.1 | - |

  ---

- ## Data

- | Split | Samples | Classes | Distribution |
- |-------|---------|---------|--------------|
- | Train | ~7,000 | 3 | Balanced after augmentation |
- | Test | 1,044 | 3 | [348, 348, 348] |

- ### Image Specifications

- - Size: Variable (resized during training)
- - Channels: 3 (RGB)
- - Source: Colposcopy images

  ---

- ## Model Files
-
- ### Best Model Location

- ```
- ./best_model.pth (this folder)
- ```

- Original training output:
- ```
- /data/downloads/cervical_type/_output/grid_search_v2_20260117_212011/run_001_L32_64_128_256_Res_SE_lr5e-04_d0.3/
  ```

- ### Checkpoint Contents

  ```python
- {
-     "epoch": 34,
-     "model_state_dict": ...,
-     "optimizer_state_dict": ...,
-     "scheduler_state_dict": ...,
-     "metrics": {...},
-     "model_config": {...}
- }
- ```

- ### Files in This Folder

- | File | Description |
- |------|-------------|
- | `best_model.pth` | Model checkpoint (weights + optimizer state) |
- | `config.json` | Training configuration used |
- | `training_history.json` | Loss/accuracy per epoch |
- | `grid_search_summary.json` | All 32 grid search results |
- | `README.md` | This file |

- ### Loading the Model

- ```python
- import torch

- # Load checkpoint (from this folder)
- checkpoint = torch.load('best_model.pth', weights_only=False)
-
- # Create model with same config
- model = BaseCNN(
-     conv_layers=[32, 64, 128, 256],
-     fc_layers=[256, 128],
-     num_classes=3,
-     dropout=0.3,
-     use_residual=True,
-     use_se_attention=True
- )
-
- # Load weights
- model.load_state_dict(checkpoint['model_state_dict'])
- model.eval()
  ```

- ---

- ## Output Structure

- ```
- _output/
- └── grid_search_v2_20260117_212011/
-     ├── grid_search_config.json    # Search space definition
-     ├── all_results.json           # All 32 run results
-     ├── summary.json               # Sorted results + best run
-     ├── logs/
-     │   └── grid_search.log
-     └── run_001_.../               # Best run
-         ├── checkpoints/
-         │   ├── best_model.pth     # Best validation accuracy
-         │   ├── latest.pth         # Final epoch
-         │   └── epoch_*.pth        # Periodic saves
-         └── logs/
-             ├── run_config.json
-             └── training_history.json
  ```

  ---

- ## Comparison with v1 Baseline
-
- | Version | Accuracy | Improvement |
- |---------|----------|-------------|
- | v1 Baseline | 61.69% | - |
- | **v2 Best (Res+SE)** | **65.52%** | **+3.83%** |

- The addition of residual connections and SE attention improved accuracy by nearly 4%.

  ---

- ## Recommendations for Future Work

- 1. **Try deeper networks** - Add 5th conv layer [32, 64, 128, 256, 512]
- 2. **Transfer learning** - Use pretrained EfficientNet or ResNet backbone
- 3. **Address Type 3 confusion** - Type 3 is often misclassified as Type 2
- 4. **Ensemble methods** - Combine top 3-5 models
- 5. **Test Time Augmentation** - Average predictions over augmented versions
- 6. **More training data** - Current ~7k samples may be limiting

  ---

- ## Quick Start
-
- ```bash
- # Run the best configuration
- python train_grid_v2.py
-
- # Or load and evaluate the best model
- python evaluate.py --model /path/to/best_model.pth
  ```

  ---

- *Last updated: January 2026*
- *Grid search: 32 configurations, ~15 hours on single GPU*
+ ---
+ license: mit
+ tags:
+ - image-classification
+ - medical-imaging
+ - cervical-cancer
+ - pytorch
+ - safetensors
+ - cnn
+ datasets:
+ - custom
+ metrics:
+ - accuracy
+ - f1
+ pipeline_tag: image-classification
+ library_name: pytorch

  ---

+ # CerviGuard - Cervical Transformation Zone Classifier

+ ## Model Description

+ This model classifies cervical images into 3 transformation zone types, a distinction that guides colposcopy evaluation and cervical cancer screening.

+ | Label | Type | Description |
+ |-------|------|-------------|
+ | 0 | Type 1 | Transformation zone fully visible on the ectocervix |
+ | 1 | Type 2 | Transformation zone partially visible (extends into the endocervical canal) |
+ | 2 | Type 3 | Transformation zone not visible (entirely within the endocervical canal) |

  ---

+ ## Model Architecture

+ ### Overview

+ **BaseCNN** - A simple convolutional neural network with 4 conv blocks and 2 fully connected layers.

  ```
+ ┌────────────────────────────────────────────────────────────┐
+ │ INPUT (256×256×3)                                          │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ CONV BLOCK 1                                               │
+ │ Conv2d(3→32, 3×3) → BatchNorm2d → ReLU → MaxPool2d(2×2)    │
+ │ Output: 128×128×32                                         │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ CONV BLOCK 2                                               │
+ │ Conv2d(32→64, 3×3) → BatchNorm2d → ReLU → MaxPool2d(2×2)   │
+ │ Output: 64×64×64                                           │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ CONV BLOCK 3                                               │
+ │ Conv2d(64→128, 3×3) → BatchNorm2d → ReLU → MaxPool2d(2×2)  │
+ │ Output: 32×32×128                                          │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ CONV BLOCK 4                                               │
+ │ Conv2d(128→256, 3×3) → BatchNorm2d → ReLU → MaxPool2d(2×2) │
+ │ Output: 16×16×256                                          │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ GLOBAL POOLING                                             │
+ │ AdaptiveAvgPool2d(1×1)                                     │
+ │ Output: 1×1×256 → Flatten → 256                            │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ FC BLOCK 1                                                 │
+ │ Linear(256→256) → ReLU → Dropout(0.4)                      │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ FC BLOCK 2                                                 │
+ │ Linear(256→128) → ReLU → Dropout(0.4)                      │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ CLASSIFIER                                                 │
+ │ Linear(128→3)                                              │
+ └────────────────────────────────────────────────────────────┘
+
+ ┌────────────────────────────────────────────────────────────┐
+ │ OUTPUT (3 logits)                                          │
+ │ [Type 1, Type 2, Type 3]                                   │
+ └────────────────────────────────────────────────────────────┘
  ```

+ ### Layer Details
+
+ | Layer | Type | In Channels | Out Channels | Kernel | Output Size |
+ |-------|------|-------------|--------------|--------|-------------|
+ | conv_layers.0 | Conv2d | 3 | 32 | 3×3 | 256×256×32 |
+ | conv_layers.1 | BatchNorm2d | 32 | 32 | - | 256×256×32 |
+ | conv_layers.2 | ReLU | - | - | - | 256×256×32 |
+ | conv_layers.3 | MaxPool2d | - | - | 2×2 | 128×128×32 |
+ | conv_layers.4 | Conv2d | 32 | 64 | 3×3 | 128×128×64 |
+ | conv_layers.5 | BatchNorm2d | 64 | 64 | - | 128×128×64 |
+ | conv_layers.6 | ReLU | - | - | - | 128×128×64 |
+ | conv_layers.7 | MaxPool2d | - | - | 2×2 | 64×64×64 |
+ | conv_layers.8 | Conv2d | 64 | 128 | 3×3 | 64×64×128 |
+ | conv_layers.9 | BatchNorm2d | 128 | 128 | - | 64×64×128 |
+ | conv_layers.10 | ReLU | - | - | - | 64×64×128 |
+ | conv_layers.11 | MaxPool2d | - | - | 2×2 | 32×32×128 |
+ | conv_layers.12 | Conv2d | 128 | 256 | 3×3 | 32×32×256 |
+ | conv_layers.13 | BatchNorm2d | 256 | 256 | - | 32×32×256 |
+ | conv_layers.14 | ReLU | - | - | - | 32×32×256 |
+ | conv_layers.15 | MaxPool2d | - | - | 2×2 | 16×16×256 |
+ | adaptive_pool | AdaptiveAvgPool2d | - | - | - | 1×1×256 |
+ | fc_layers.0 | Linear | 256 | 256 | - | 256 |
+ | fc_layers.1 | ReLU | - | - | - | 256 |
+ | fc_layers.2 | Dropout | - | - | p=0.4 | 256 |
+ | fc_layers.3 | Linear | 256 | 128 | - | 128 |
+ | fc_layers.4 | ReLU | - | - | - | 128 |
+ | fc_layers.5 | Dropout | - | - | p=0.4 | 128 |
+ | classifier | Linear | 128 | 3 | - | 3 |
+
+ ### Model Summary
+
+ | Property | Value |
+ |----------|-------|
+ | **Total Parameters** | 488,451 |
+ | **Trainable Parameters** | 488,451 |
+ | **Input Size** | (B, 3, 256, 256) |
+ | **Output Size** | (B, 3) |
+ | **Model Size** | ~1.9 MB |

  ---

+ ## Training Configuration
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Learning Rate | 1e-4 |
+ | Batch Size | 32 |
+ | Dropout | 0.4 |
+ | Optimizer | Adam |
+ | Loss Function | CrossEntropyLoss |
+ | Epochs | 50 |
+ | Best Epoch | 41 |
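As a hedged sketch, one optimization step consistent with the table above (Adam at 1e-4, batch size 32, cross-entropy) looks like the following; the tiny stand-in network is not the repository's `BaseCNN`, which lives in `model.py`:

```python
import torch
import torch.nn as nn

# Minimal sketch of one training step with the configured settings
# (Adam, lr=1e-4, CrossEntropyLoss, batch size 32). The stand-in
# module below is NOT the repository's BaseCNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

images = torch.randn(32, 3, 16, 16)   # one batch of 32 small RGB images
labels = torch.randint(0, 3, (32,))   # 3 transformation zone types

optimizer.zero_grad()
logits = model(images)                # shape: (32, 3)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```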

  ---

+ ## Performance

+ | Metric | Value |
+ |--------|-------|
+ | **Validation Accuracy** | 61.69% |
+ | **Macro F1 Score** | 61.81% |

+ ### Per-Class Performance

+ | Type | Precision | Recall | F1 Score |
+ |------|-----------|--------|----------|
+ | Type 1 | - | - | 68.32% |
+ | Type 2 | - | - | 56.41% |
+ | Type 3 | - | - | 60.69% |

  ---

+ ## Usage

+ ### Installation

+ ```bash
+ pip install torch torchvision safetensors huggingface_hub
  ```

+ ### Quick Start (Local)

  ```python
+ import torch
+ from PIL import Image
+ from torchvision import transforms

+ # Load the model
+ from model import BaseCNN
+ model = BaseCNN.from_pretrained("./")
+ model.eval()

+ # Preprocess the image (matches preprocessor_config.json)
+ transform = transforms.Compose([
+     transforms.Resize((256, 256)),
+     transforms.ToTensor(),
+     transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
+ ])

+ image = Image.open("cervical_image.jpg").convert("RGB")
+ input_tensor = transform(image).unsqueeze(0)

+ # Inference
+ with torch.no_grad():
+     output = model(input_tensor)
+     probabilities = torch.softmax(output, dim=1)
+     prediction = output.argmax(dim=1).item()

+ labels = ["Type 1", "Type 2", "Type 3"]
+ print(f"Prediction: {labels[prediction]}")
+ print(f"Confidence: {probabilities[0][prediction]:.2%}")
  ```

+ ### Load from Hugging Face Hub

+ ```python
+ from huggingface_hub import hf_hub_download
+ from safetensors.torch import load_file
+ import json
+ import importlib.util
+
+ # Download the model files
+ repo_id = "toderian/cerviguard_transfer_zones"
+ model_weights = hf_hub_download(repo_id, "model.safetensors")
+ config_file = hf_hub_download(repo_id, "config.json")
+ model_file = hf_hub_download(repo_id, "model.py")
+
+ # Load the model class dynamically
+ spec = importlib.util.spec_from_file_location("model", model_file)
+ model_module = importlib.util.module_from_spec(spec)
+ spec.loader.exec_module(model_module)
+
+ # Load the config and create the model
+ with open(config_file) as f:
+     config = json.load(f)
+
+ model = model_module.BaseCNN(**config['model_config'])
+ model.load_state_dict(load_file(model_weights))
+ model.eval()

+ # The model is now ready for inference
  ```

  ---

+ ## Files in This Repository

+ | File | Description |
+ |------|-------------|
+ | `model.safetensors` | Model weights (SafeTensors format, recommended) |
+ | `pytorch_model.bin` | Model weights (PyTorch format, backup) |
+ | `config.json` | Model architecture configuration |
+ | `model.py` | Model class definition (BaseCNN) |
+ | `preprocessor_config.json` | Image preprocessing configuration |
+ | `README.md` | This model card |

  ---

+ ## Limitations

+ - The model was trained on a single dataset and may not generalize across cervical imaging equipment
+ - Type 2 classification is the weakest (56.41% F1), as it represents an intermediate state between Types 1 and 3
+ - Input images should be 256×256 RGB
+ - This is a custom PyTorch model; it is not compatible with `transformers.AutoModel`

  ---

+ ## Citation

+ ```bibtex
+ @misc{cerviguard-transfer-zones,
+   title={CerviGuard Cervical Transformation Zone Classifier},
+   author={toderian},
+   year={2026},
+   howpublished={\url{https://huggingface.co/toderian/cerviguard_transfer_zones}}
+ }
  ```

  ---

+ ## License
+
+ MIT License
config.json CHANGED
@@ -1,26 +1,36 @@
  {
-   "batch_size": 32,
-   "learning_rate": 0.0005,
-   "weight_decay": 0.0001,
-   "layers": [
-     32,
-     64,
-     128,
-     256
-   ],
-   "use_residual": true,
-   "use_se_attention": true,
-   "focal_gamma": 2.0,
-   "label_smoothing": 0.1,
-   "dropout": 0.3,
-   "kernel": 3,
-   "batchnorm": true,
-   "activation": "ReLU",
-   "pool": true,
-   "fc_multipliers": [
-     1.0,
-     0.5
-   ],
-   "nr_classes": 3,
-   "augmentation": true
  }
  {
+   "model_type": "BaseCNN",
+   "model_config": {
+     "layers": [
+       32,
+       64,
+       128,
+       256
+     ],
+     "kernel": 3,
+     "padding": 1,
+     "stride": 1,
+     "batchnorm": true,
+     "bn_pre_activ": true,
+     "activation": "ReLU",
+     "dropout": 0.4,
+     "pool": true,
+     "fc_layers": [
+       256,
+       128
+     ],
+     "nr_classes": 3,
+     "in_channels": 3
+   },
+   "num_labels": 3,
+   "id2label": {
+     "0": "Type 1",
+     "1": "Type 2",
+     "2": "Type 3"
+   },
+   "label2id": {
+     "Type 1": 0,
+     "Type 2": 1,
+     "Type 3": 2
+   }
  }
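One detail worth noting when consuming this config: JSON serializes the `id2label` keys as strings, so an integer class index from the model must be converted before lookup. A minimal self-contained sketch (the table is inlined here instead of read from disk):

```python
import json

# Sketch: map a predicted class index back to its label via id2label.
# JSON object keys are strings, so the integer index needs str().
config_text = '{"id2label": {"0": "Type 1", "1": "Type 2", "2": "Type 3"}}'
config = json.loads(config_text)

pred_idx = 2                                 # hypothetical model output
label = config["id2label"][str(pred_idx)]    # -> "Type 3"
```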
model.py ADDED
@@ -0,0 +1,244 @@
+ """
+ Cervical Type Classification Model
+
+ This module contains the BaseCNN model for classifying cervical images
+ into 3 transformation zone types.
+
+ Usage:
+     from model import BaseCNN
+
+     # Load pretrained model
+     model = BaseCNN.from_pretrained("./")
+
+     # Or create from scratch
+     model = BaseCNN(
+         layers=[32, 64, 128, 256],
+         fc_layers=[256, 128],
+         nr_classes=3
+     )
+ """
+
+ import json
+ from pathlib import Path
+
+ import torch
+ import torch.nn as nn
+
+ try:
+     from safetensors.torch import load_file, save_file
+     HAS_SAFETENSORS = True
+ except ImportError:
+     HAS_SAFETENSORS = False
+
+
+ class BaseCNN(nn.Module):
+     """
+     Simple CNN for cervical type classification.
+
+     Classifies cervical images into 3 transformation zone types:
+     - Type 1: Transformation zone fully visible on ectocervix
+     - Type 2: Transformation zone partially visible
+     - Type 3: Transformation zone not visible (within endocervical canal)
+
+     Args:
+         layers: List of output channels for each conv layer. Default: [32, 64, 128, 256]
+         kernel: Kernel size for conv layers. Default: 3
+         padding: Padding for conv layers. Default: 1
+         stride: Stride for conv layers. Default: 1
+         batchnorm: Whether to use batch normalization. Default: True
+         bn_pre_activ: Whether to apply BN before activation. Default: True
+         activation: Activation function name. Default: 'ReLU'
+         dropout: Dropout rate for FC layers. Default: 0.4
+         pool: Whether to use max pooling after each conv. Default: True
+         fc_layers: List of FC layer sizes. Default: [256, 128]
+         nr_classes: Number of output classes. Default: 3
+         in_channels: Number of input channels. Default: 3
+     """
+
+     def __init__(
+         self,
+         layers: list = None,
+         kernel: int = 3,
+         padding: int = 1,
+         stride: int = 1,
+         batchnorm: bool = True,
+         bn_pre_activ: bool = True,
+         activation: str = 'ReLU',
+         dropout: float = 0.4,
+         pool: bool = True,
+         fc_layers: list = None,
+         nr_classes: int = 3,
+         in_channels: int = 3,
+     ):
+         super().__init__()
+
+         # Store config for serialization
+         self.config = {
+             'layers': layers or [32, 64, 128, 256],
+             'kernel': kernel,
+             'padding': padding,
+             'stride': stride,
+             'batchnorm': batchnorm,
+             'bn_pre_activ': bn_pre_activ,
+             'activation': activation,
+             'dropout': dropout,
+             'pool': pool,
+             'fc_layers': fc_layers or [256, 128],
+             'nr_classes': nr_classes,
+             'in_channels': in_channels,
+         }
+
+         layers = self.config['layers']
+         fc_layers = self.config['fc_layers']
+
+         # Activation function
+         activation_fn = getattr(nn, activation)
+
+         # Build convolutional layers (ModuleList to match original)
+         self.conv_layers = nn.ModuleList()
+         prev_channels = in_channels
+
+         for out_channels in layers:
+             self.conv_layers.append(
+                 nn.Conv2d(prev_channels, out_channels, kernel, stride, padding)
+             )
+             if batchnorm and bn_pre_activ:
+                 self.conv_layers.append(nn.BatchNorm2d(out_channels))
+             self.conv_layers.append(activation_fn())
+             if batchnorm and not bn_pre_activ:
+                 self.conv_layers.append(nn.BatchNorm2d(out_channels))
+             if pool:
+                 self.conv_layers.append(nn.MaxPool2d(2, 2))
+             prev_channels = out_channels
+
+         # Global average pooling
+         self.adaptive_pool = nn.AdaptiveAvgPool2d(1)
+
+         # Build fully connected layers (ModuleList to match original)
+         self.fc_layers = nn.ModuleList()
+         prev_features = layers[-1]
+
+         for fc_size in fc_layers:
+             self.fc_layers.append(nn.Linear(prev_features, fc_size))
+             self.fc_layers.append(activation_fn())
+             self.fc_layers.append(nn.Dropout(dropout))
+             prev_features = fc_size
+
+         # Final classifier (separate, to match original)
+         self.classifier = nn.Linear(prev_features, nr_classes)
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         """
+         Forward pass.
+
+         Args:
+             x: Input tensor of shape (batch_size, 3, 256, 256)
+
+         Returns:
+             Logits tensor of shape (batch_size, num_classes)
+         """
+         for layer in self.conv_layers:
+             x = layer(x)
+
+         x = self.adaptive_pool(x)
+         x = x.view(x.size(0), -1)
+
+         for layer in self.fc_layers:
+             x = layer(x)
+
+         x = self.classifier(x)
+         return x
+
+     @classmethod
+     def from_pretrained(cls, model_path: str, device: str = 'cpu') -> 'BaseCNN':
+         """
+         Load a pretrained model from a directory.
+
+         Args:
+             model_path: Path to directory containing model files
+             device: Device to load model on ('cpu' or 'cuda')
+
+         Returns:
+             Loaded model in eval mode
+         """
+         model_path = Path(model_path)
+
+         # Load config
+         config_path = model_path / 'config.json'
+         with open(config_path, 'r') as f:
+             config = json.load(f)
+
+         # Create model
+         model = cls(**config['model_config'])
+
+         # Load weights (prefer safetensors)
+         safetensors_path = model_path / 'model.safetensors'
+         pytorch_path = model_path / 'pytorch_model.bin'
+
+         if safetensors_path.exists() and HAS_SAFETENSORS:
+             state_dict = load_file(str(safetensors_path), device=device)
+         elif pytorch_path.exists():
+             state_dict = torch.load(pytorch_path, map_location=device, weights_only=True)
+         else:
+             raise FileNotFoundError(f"No model weights found in {model_path}")
+
+         model.load_state_dict(state_dict)
+         model.to(device)
+         model.eval()
+         return model
+
+     def save_pretrained(self, save_path: str) -> None:
+         """
+         Save model in Hugging Face compatible format.
+
+         Args:
+             save_path: Directory to save model files
+         """
+         save_path = Path(save_path)
+         save_path.mkdir(parents=True, exist_ok=True)
+
+         # Save config
+         config = {
+             'model_type': 'BaseCNN',
+             'model_config': self.config,
+             'num_labels': self.config['nr_classes'],
+             'id2label': {
+                 '0': 'Type 1',
+                 '1': 'Type 2',
+                 '2': 'Type 3'
+             },
+             'label2id': {
+                 'Type 1': 0,
+                 'Type 2': 1,
+                 'Type 3': 2
+             }
+         }
+         with open(save_path / 'config.json', 'w') as f:
+             json.dump(config, f, indent=2)
+
+         # Save weights
+         state_dict = {k: v.contiguous() for k, v in self.state_dict().items()}
+
+         # SafeTensors format (recommended)
+         if HAS_SAFETENSORS:
+             save_file(state_dict, str(save_path / 'model.safetensors'))
+
+         # PyTorch format (backup)
+         torch.save(state_dict, save_path / 'pytorch_model.bin')
+
+
+ # Label mappings
+ ID2LABEL = {0: 'Type 1', 1: 'Type 2', 2: 'Type 3'}
+ LABEL2ID = {'Type 1': 0, 'Type 2': 1, 'Type 3': 2}
+
+
+ if __name__ == '__main__':
+     # Quick test
+     model = BaseCNN()
+     print(f"Model parameters: {sum(p.numel() for p in model.parameters()):,}")
+
+     # Test forward pass
+     x = torch.randn(1, 3, 256, 256)
+     y = model(x)
+     print(f"Input shape: {x.shape}")
+     print(f"Output shape: {y.shape}")
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:beb3e17da6b94596232aa18078b9d22872f4711c7c1ef21a35f3277175d14063
+ size 1960588
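The weights themselves are stored via Git LFS; the three lines above are the pointer file, not the tensor data. Its key-value format can be parsed in a couple of lines, sketched here with the pointer inlined:

```python
# Sketch: parse a Git LFS pointer file into its fields.
# Each line is "key value"; the oid field is "algorithm:hex-digest".
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:beb3e17da6b94596232aa18078b9d22872f4711c7c1ef21a35f3277175d14063
size 1960588"""

fields = dict(line.split(" ", 1) for line in pointer.splitlines())
algo, digest = fields["oid"].split(":", 1)
size_bytes = int(fields["size"])   # actual payload size on the LFS server
```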
preprocessor_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "do_normalize": true,
+   "do_rescale": true,
+   "do_resize": true,
+   "image_mean": [0.5, 0.5, 0.5],
+   "image_std": [0.5, 0.5, 0.5],
+   "image_processor_type": "ImageProcessor",
+   "resample": 3,
+   "rescale_factor": 0.00392156862745098,
+   "size": {
+     "height": 256,
+     "width": 256
+   }
+ }
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3d88b2345cde9dcdc7fc2b8ba76edb2c64abfbc274f320bd55ad9e12801c9b00
+ size 1969453