Sentoz committed on
Commit 3e93e14 · verified · 1 Parent(s): 37aa6b6

Deploy KidneyDL CT Scan Classifier
.dockerignore ADDED
@@ -0,0 +1,49 @@
+ # Git
+ .git
+ .gitignore
+
+ # DVC internals
+ .dvc/cache
+ .dvc/tmp
+
+ # CT scan training images are large and not needed inside the container.
+ # The trained model at artifacts/training/model.h5 is kept so the container
+ # can serve predictions without a volume mount.
+ artifacts/data_ingestion/
+
+ # Python caches
+ __pycache__
+ *.py[cod]
+ *.pyo
+ *.pyd
+ .Python
+
+ # Virtual environments and Conda
+ .venv
+ venv
+ env
+ *.egg-info
+ dist
+ build
+
+ # Jupyter notebooks (not needed in the container)
+ research/
+ *.ipynb
+ .ipynb_checkpoints
+
+ # Logs
+ logs/
+ *.log
+
+ # Secrets (never bake credentials into the image)
+ .env
+
+ # Test and dev artifacts
+ uploads/
+ scores.json
+
+ # Editor and OS noise
+ .vscode
+ .idea
+ *.DS_Store
+ Thumbs.db
.dvc/.gitignore ADDED
@@ -0,0 +1,3 @@
+ /config.local
+ /tmp
+ /cache
.dvc/config ADDED
File without changes
.dvcignore ADDED
@@ -0,0 +1,3 @@
+ # Add patterns of files dvc should ignore, which could improve
+ # the performance. Learn more at
+ # https://dvc.org/doc/user-guide/dvcignore
Dockerfile ADDED
@@ -0,0 +1,34 @@
+ FROM python:3.10-slim
+
+ # Keeps Python output unbuffered so logs appear immediately in Docker
+ ENV PYTHONUNBUFFERED=1 \
+     PYTHONUTF8=1 \
+     PIP_NO_CACHE_DIR=1 \
+     PIP_DISABLE_PIP_VERSION_CHECK=1
+
+ WORKDIR /app
+
+ # System libraries required by TensorFlow, OpenCV, and image processing
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     libglib2.0-0 \
+     libsm6 \
+     libxrender1 \
+     libxext6 \
+     libgl1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Install Python dependencies first so this layer is cached between code changes
+ COPY requirements.txt .
+ RUN pip install --upgrade pip && \
+     pip install -r requirements.txt
+
+ # Copy the full project and install the cnnClassifier package
+ COPY . .
+ RUN pip install -e .
+
+ # Directory for uploaded scan images at runtime
+ RUN mkdir -p uploads
+
+ EXPOSE 7860
+
+ CMD ["python", "app.py"]
README.md CHANGED
@@ -1,11 +1,347 @@
- ---
- title: Kidney Classifier
- emoji: 🌍
- colorFrom: blue
- colorTo: gray
- sdk: docker
- pinned: false
- license: mit
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ---
+ title: KidneyDL CT Scan Classifier
+ emoji: 🫁
+ colorFrom: blue
+ colorTo: indigo
+ sdk: docker
+ app_port: 7860
+ pinned: true
+ license: mit
+ ---
+
+ ## KidneyAI: End-to-End Kidney CT Scan Classification with MLOps
+
+ [![Python](https://img.shields.io/badge/Python-3.10-blue?logo=python&logoColor=white)](https://www.python.org/)
+ [![TensorFlow](https://img.shields.io/badge/TensorFlow-2.x-orange?logo=tensorflow&logoColor=white)](https://www.tensorflow.org/)
+ [![DVC](https://img.shields.io/badge/DVC-Pipeline%20Versioning-945DD6?logo=dvc&logoColor=white)](https://dvc.org/)
+ [![MLflow](https://img.shields.io/badge/MLflow-Experiment%20Tracking-0194E2?logo=mlflow&logoColor=white)](https://mlflow.org/)
+ [![DagsHub](https://img.shields.io/badge/DagsHub-Remote%20Tracking-FF6B35?logoColor=white)](https://dagshub.com/)
+ [![Flask](https://img.shields.io/badge/Flask-Web%20App-000000?logo=flask&logoColor=white)](https://flask.palletsprojects.com/)
+ [![Docker](https://img.shields.io/badge/Docker-Containerised-2496ED?logo=docker&logoColor=white)](https://www.docker.com/)
+
+ ---
+
+ ## What This Project Is
+
+ This is a production-style, end-to-end machine learning project that classifies kidney CT scan images as either **Normal** or **Tumor**. But the model itself is only one piece of the story. The real focus of this project is everything that surrounds it: a fully reproducible DVC pipeline, experiment tracking with MLflow and DagsHub, a clean configuration-driven codebase, a Flask web application, and a Dockerised deployment setup.
+
+ It was built to demonstrate what a real MLOps workflow looks like in practice: not just the notebook that produces a metric, but the entire system that allows a model to be trained, evaluated, versioned, and served reliably.
+
+ ---
+
+ ## The Problem
+
+ Kidney disease is among the leading causes of death globally, and it often goes undetected until its later stages, when treatment options become limited. Radiologists manually reviewing CT scans are under enormous pressure, and any tool that can reliably flag suspicious scans for closer attention has genuine clinical value.
+
+ This project builds a binary image classifier that can look at a kidney CT scan and tell you, within seconds, whether the kidney appears normal or shows signs of a tumor. It is trained on a labelled CT scan dataset and achieves approximately **89.9% validation accuracy** using a fine-tuned VGG16 network.
+
+ ---
+
+ ## Why VGG16?
+
+ VGG16 was selected deliberately, not arbitrarily. Here is the reasoning:
+
+ Its architecture is built from uniform 3x3 convolutional layers stacked with increasing depth. This design is especially good at learning fine-grained local textures, which is critical in medical imaging, where the difference between healthy and abnormal tissue often comes down to subtle structural patterns rather than large-scale shape differences.
+
+ Pre-trained on ImageNet, VGG16 already knows how to see. Its lower layers encode general-purpose feature detectors for edges, corners, and textures. Those weights do not need to be learned from scratch. Only the top classification layers need to be adapted to the kidney scan domain, which means the model can achieve strong performance with far less labelled data than training from scratch would require.
+
+ It is also a stable, well-understood architecture. In a medical context, that matters. The behaviour of the model is predictable, and the features it learns can be interpreted through tools like Grad-CAM.
+
+ ---
+
+ ## Model Performance
+
+ | Metric   | Value  |
+ |----------|--------|
+ | Accuracy | 89.9%  |
+ | Loss     | 1.26   |
+
+ Metrics are logged automatically to MLflow after every pipeline run. You can view all experiment runs, compare parameters, and download model artifacts directly from the DagsHub MLflow UI.
+
+ ---
+
+ ## Project Structure
+
+ ```text
+ Kidney_classification_Using_MLOPS_and_DVC/
+
+ ├── config/
+ │   └── config.yaml                     Central path and artifact configuration
+
+ ├── params.yaml                         All model hyperparameters in one place
+ ├── dvc.yaml                            DVC pipeline stage definitions
+ ├── dvc.lock                            DVC lock file tracking stage state
+ ├── main.py                             Runs all pipeline stages sequentially
+ ├── app.py                              Flask web application
+ ├── Dockerfile                          Container definition for the prediction server
+ ├── requirements.txt                    Python dependencies
+ ├── setup.py                            Installable package definition
+ ├── scores.json                         Latest evaluation metrics
+
+ ├── src/cnnClassifier/
+ │   ├── __init__.py                     Logger setup
+ │   ├── constants/                      Project-wide constants (config file paths)
+ │   ├── entity/
+ │   │   └── config_entity.py            Typed dataclasses for each pipeline stage config
+ │   ├── config/
+ │   │   └── configuration.py            ConfigurationManager: reads YAML and builds configs
+ │   ├── utils/
+ │   │   └── common.py                   Shared utilities: YAML reading, directory creation, JSON saving
+ │   ├── components/
+ │   │   ├── data_ingestion.py           Downloads and extracts the dataset
+ │   │   ├── prepare_base_model.py       Loads VGG16 and adds the classification head
+ │   │   ├── model_trainer.py            Trains the model with augmentation support
+ │   │   └── model_evaluation_mlflow.py  Evaluates and logs to MLflow via DagsHub
+ │   └── pipeline/
+ │       ├── stage_01_data_ingestion.py
+ │       ├── stage_02_prepare_base_model.py
+ │       ├── stage_03_model_trainer.py
+ │       ├── stage_04_model_evaluation.py
+ │       └── prediction.py               Prediction pipeline used by the Flask app
+
+ ├── research/
+ │   ├── 01_data_ingestion.ipynb
+ │   ├── 02_prepare_base_model.ipynb
+ │   ├── 03_model_trainer.ipynb
+ │   └── 04_model_evaluation.ipynb       Each stage was prototyped here first
+
+ └── templates/
+     └── index.html                      Web UI for the prediction app
+ ```
+
+ ---
+
+ ## The ML Pipeline
+
+ The pipeline has four stages, each defined in `dvc.yaml` and executed in order by DVC.
+
+ ```text
+ Stage 1           Stage 2                  Stage 3          Stage 4
+ Data Ingestion    Base Model Preparation   Model Training   Model Evaluation
+ ```
+
+ ### Stage 1: Data Ingestion
+
+ Downloads the kidney CT scan dataset from Google Drive using `gdown`, extracts the zip archive, and places the images into the `artifacts/data_ingestion/` directory. DVC tracks the output so this stage is skipped if the data already exists and nothing has changed.
+
+ ### Stage 2: Base Model Preparation
+
+ Loads VGG16 with ImageNet weights and without its top classification layers. Adds a custom head: a global average pooling layer followed by a dense output layer with softmax activation for the two classes, Normal and Tumor. The base VGG16 layers are frozen. The resulting model is saved to disk so the training stage can pick it up.
+
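The custom head ends in a two-unit softmax layer. As a reminder of what that final activation computes, here is a minimal stdlib sketch; the logit values are invented for illustration and are not outputs of the actual model:

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the max logit first for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the two classes [Normal, Tumor]
probs = softmax([2.0, 0.5])
print(probs)  # the first (Normal) probability is the larger one
```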
+ ### Stage 3: Model Training
+
+ Loads the prepared base model, recompiles it with an SGD optimiser, and trains it on the kidney CT images. Supports data augmentation (horizontal flip, zoom, shear) to improve generalisation. The trained model is saved as `artifacts/training/model.h5`.
+
+ ### Stage 4: Model Evaluation
+
+ Loads the trained model and evaluates it against the 30 percent validation split. Loss and accuracy are saved to `scores.json` and logged to MLflow. The model is also registered in the MLflow Model Registry under the name `VGG16Model`.
+
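The save step for `scores.json` is simple enough to sketch with the stdlib alone (the project's actual helper lives in `utils/common.py`; this stand-in writes to a temp directory, and the metric values shown are the ones reported above):

```python
import json
import tempfile
from pathlib import Path

def save_scores(path: Path, loss: float, accuracy: float) -> None:
    """Persist evaluation metrics in the shape scores.json uses."""
    path.write_text(json.dumps({"loss": loss, "accuracy": accuracy}, indent=4))

# Written to a temp dir here; the pipeline writes scores.json at the repo root
scores_path = Path(tempfile.mkdtemp()) / "scores.json"
save_scores(scores_path, loss=1.26, accuracy=0.899)

scores = json.loads(scores_path.read_text())
print(scores["accuracy"])  # 0.899
```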
+ ---
+
+ ## Experiment Tracking with MLflow and DagsHub
+
+ All runs are tracked remotely on DagsHub, which acts as the MLflow tracking server. Every time the evaluation stage runs, it logs:
+
+ - All hyperparameters from `params.yaml`
+ - Validation loss and accuracy
+ - The trained model as an MLflow artifact
+ - A registered model version in the MLflow Model Registry
+
+ You can view the experiment runs at:
+ [https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow](https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow)
+
+ ---
+
+ ## Configuration
+
+ Everything is driven by two YAML files. There are no hardcoded paths or hyperparameters anywhere in the source code.
+
+ **`config/config.yaml`** manages all file paths and artifact locations:
+
+ ```yaml
+ artifacts_root: artifacts
+
+ data_ingestion:
+   root_dir: artifacts/data_ingestion
+   source_URL: "https://drive.google.com/file/d/16PZpADG4Pl_SBr2E3DEcvXsLQ5DSUtDP/view?usp=sharing"
+   local_data_file: artifacts/data_ingestion/data.zip
+   unzip_dir: artifacts/data_ingestion
+
+ prepare_base_model:
+   root_dir: artifacts/prepare_base_model
+   base_model_path: artifacts/prepare_base_model/base_model.h5
+   updated_base_model_path: artifacts/prepare_base_model/base_model_updated.h5
+
+ training:
+   root_dir: artifacts/training
+   trained_model_path: artifacts/training/model.h5
+
+ evaluation:
+   path_of_model: artifacts/training/model.h5
+   training_data: artifacts/data_ingestion/kidney-ct-scan-image
+   mlflow_uri: "https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow"
+   all_params:
+     AUGMENTATION: True
+     IMAGE_SIZE: [224, 224, 3]
+     BATCH_SIZE: 16
+     INCLUDE_TOP: False
+     EPOCHS: 5
+     CLASSES: 2
+     WEIGHTS: imagenet
+     LEARNING_RATE: 0.01
+ ```
+
+ **`params.yaml`** is where all model hyperparameters live:
+
+ ```yaml
+ AUGMENTATION: True
+ IMAGE_SIZE: [224, 224, 3]
+ BATCH_SIZE: 16
+ INCLUDE_TOP: False
+ EPOCHS: 5
+ CLASSES: 2
+ WEIGHTS: imagenet
+ LEARNING_RATE: 0.01
+ ```
+
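The project reads these YAML files through python-box's `ConfigBox`, which turns the parsed dicts into attribute-accessible objects. A rough stdlib stand-in shows the idea (this is an illustrative sketch, not the library's implementation):

```python
from types import SimpleNamespace

def to_namespace(d):
    """Recursively convert a nested dict (parsed YAML) into dot-accessible objects."""
    if isinstance(d, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in d.items()})
    return d

# A fragment of config.yaml, already parsed into a plain dict
config = to_namespace({
    "training": {
        "root_dir": "artifacts/training",
        "trained_model_path": "artifacts/training/model.h5",
    }
})
print(config.training.trained_model_path)  # artifacts/training/model.h5
```

Dot access like `config.training.trained_model_path` is what keeps the ConfigurationManager code readable compared to chained `dict` lookups.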
+ ---
+
+ ## How to Run Locally
+
+ ### 1. Clone the repository
+
+ ```bash
+ git clone https://github.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.git
+ cd Kidney_classification_Using_MLOPS_and_DVC_Data-version-control
+ ```
+
+ ### 2. Create and activate a Conda environment
+
+ ```bash
+ conda create -n kidney python=3.10 -y
+ conda activate kidney
+ ```
+
+ ### 3. Install dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ pip install -e .
+ ```
+
+ ### 4. Set up your MLflow credentials
+
+ Create a `.env` file in the project root with your DagsHub token:
+
+ ```env
+ MLFLOW_TRACKING_USERNAME=your_dagshub_username
+ MLFLOW_TRACKING_PASSWORD=your_dagshub_token
+ ```
+
+ This file is gitignored and will never be committed.
+
+ ### 5. Run the full pipeline
+
+ ```bash
+ dvc repro
+ ```
+
+ DVC will execute all four stages in order. If any stage has already run and its inputs have not changed, it will be skipped automatically. After the pipeline finishes, `scores.json` will contain the latest evaluation metrics.
+
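DVC decides whether a stage must rerun by comparing content hashes of its dependencies against what `dvc.lock` recorded. A toy illustration of that idea, using the same MD5 hashing that the lock file in this repo stores (the file name here is hypothetical and written to a temp directory):

```python
import hashlib
import tempfile
from pathlib import Path

def file_md5(path: Path) -> str:
    """Hash a file's bytes, mirroring the md5 entries stored in dvc.lock."""
    return hashlib.md5(path.read_bytes()).hexdigest()

# Hypothetical dependency file standing in for params.yaml
dep = Path(tempfile.mkdtemp()) / "params_demo.yaml"
dep.write_text("EPOCHS: 5\n")
recorded = file_md5(dep)          # what dvc.lock would remember

dep.write_text("EPOCHS: 10\n")    # edit the dependency
must_rerun = file_md5(dep) != recorded
print("stage must rerun:", must_rerun)
```

When every dependency's hash matches the lock file, the stage is skipped; any mismatch invalidates the stage and everything downstream of it.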
+ ### 6. Launch the web application
+
+ ```bash
+ python app.py
+ ```
+
+ Open your browser and go to `http://localhost:7860` (the default port set in `app.py`). You can upload a kidney CT scan image and get a classification result instantly.
+
+ ### 7. View experiment runs
+
+ ```bash
+ mlflow ui
+ ```
+
+ Open `http://localhost:5000` to browse all local experiment runs, or visit the DagsHub MLflow URL above to see all remotely tracked runs.
+
+ ---
+
+ ## Run with Docker
+
+ ```bash
+ docker build -t kidney-classifier .
+ docker run -p 7860:7860 kidney-classifier
+ ```
+
+ Open `http://localhost:7860` in your browser. The container listens on port 7860, the port exposed in the Dockerfile.
+
+ ---
+
+ ## The Web Application
+
+ The Flask app exposes three routes:
+
+ | Route      | Method | Description                                                         |
+ | ---------- | ------ | ------------------------------------------------------------------- |
+ | `/`        | GET    | Serves the prediction web UI                                        |
+ | `/predict` | POST   | Accepts an image file and returns the classification result as JSON |
+ | `/train`   | GET    | Reruns `main.py` to retrain the model from scratch                  |
+
+ The prediction endpoint returns a response like this:
+
+ ```json
+ [{"image": "Normal"}]
+ ```
+
+ or
+
+ ```json
+ [{"image": "Tumor"}]
+ ```
+
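A client receiving that response needs nothing beyond the stdlib to decode it; a small sketch of the parsing step (the response bodies are the two shapes shown above):

```python
import json

def parse_prediction(body: str) -> str:
    """Extract the class label from the /predict JSON response."""
    return json.loads(body)[0]["image"]

label = parse_prediction('[{"image": "Tumor"}]')
print(label)  # Tumor
```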
+ The UI supports drag and drop, shows a live preview of the uploaded scan, displays the result with a confidence bar, and works in both light and dark mode with automatic detection of your system preference.
+
+ ---
+
+ ## Tech Stack
+
+ | Area                | Tools                                             |
+ | ------------------- | ------------------------------------------------- |
+ | Deep Learning       | TensorFlow and Keras with VGG16 transfer learning |
+ | Data Versioning     | DVC                                               |
+ | Experiment Tracking | MLflow hosted on DagsHub                          |
+ | Web Framework       | Flask with Flask-CORS                             |
+ | Data Processing     | NumPy, Pandas, scikit-learn                       |
+ | Configuration       | PyYAML and python-box                             |
+ | Package Management  | setuptools with src layout, editable install      |
+ | Containerisation    | Docker                                            |
+ | Environment         | Conda with pip                                    |
+
+ ---
+
+ ## MLOps Concepts Demonstrated
+
+ | Concept                  | How it is implemented                                                           |
+ | ------------------------ | ------------------------------------------------------------------------------- |
+ | Data versioning          | DVC tracks the dataset and all model artifacts                                  |
+ | Pipeline as code         | `dvc.yaml` defines every stage and its dependencies                             |
+ | Incremental execution    | DVC only reruns stages whose inputs have changed                                |
+ | Experiment tracking      | MLflow logs parameters, metrics, and model artifacts on every run               |
+ | Model registry           | Trained models are registered and versioned in the MLflow Model Registry        |
+ | Configuration management | All paths and hyperparameters live in YAML files with no hardcoded values       |
+ | Modular ML package       | Source code is structured as an installable Python package                      |
+ | Reproducibility          | Any contributor can clone the repo and run `dvc repro` to get identical results |
+ | Containerisation         | Dockerfile ensures the app runs consistently in any environment                 |
+ | REST API serving         | Flask wraps the prediction pipeline and exposes it over HTTP                    |
+
+ ---
+
+ ## About the Author
+
+ **Paul Sentongo** is a data scientist and applied AI researcher with a Master's degree in Data Science. He is passionate about building machine learning systems that go beyond the notebook: reproducible, traceable, and deployable. His research interests include deep learning for medical imaging, MLOps infrastructure, and the practical challenges of making AI work in the real world.
+
+ Paul is currently open to research positions and industry roles where he can contribute to meaningful AI projects and grow alongside motivated teams.
+
+ - GitHub: [github.com/sentongo-web](https://github.com/sentongo-web)
+ - LinkedIn: [linkedin.com/in/paul-sentongo-885041284](https://www.linkedin.com/in/paul-sentongo-885041284/)
+ - Email: sentongogray1992@gmail.com
app.py ADDED
@@ -0,0 +1,44 @@
+ import os
+ from flask import Flask, request, jsonify, render_template
+ from flask_cors import CORS
+ from cnnClassifier.pipeline.prediction import PredictionPipeline
+
+ app = Flask(__name__)
+ CORS(app)
+
+ UPLOAD_FOLDER = "uploads"
+ os.makedirs(UPLOAD_FOLDER, exist_ok=True)
+
+
+ @app.route("/", methods=["GET"])
+ def home():
+     return render_template("index.html")
+
+
+ @app.route("/train", methods=["GET", "POST"])
+ def train():
+     os.system("python main.py")
+     return "Training completed successfully!"
+
+
+ @app.route("/predict", methods=["POST"])
+ def predict():
+     if "file" not in request.files:
+         return jsonify({"error": "No file uploaded"}), 400
+
+     file = request.files["file"]
+     if file.filename == "":
+         return jsonify({"error": "No file selected"}), 400
+
+     filepath = os.path.join(UPLOAD_FOLDER, file.filename)
+     file.save(filepath)
+
+     pipeline = PredictionPipeline(filepath)
+     result = pipeline.predict()
+
+     return jsonify(result)
+
+
+ if __name__ == "__main__":
+     port = int(os.environ.get("PORT", 7860))
+     app.run(host="0.0.0.0", port=port, debug=False)
artifacts/prepare_base_model/base_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f6fc070728f3f1ce3d0f140b0ffeee893dc470fec846f5aaa0246f315c2fcb6b
+ size 58926080
artifacts/prepare_base_model/base_model_updated.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f9b58a14a5bb7222c23e8400d8b8e820e4557588d47cbe666b28f9c89313f6bc
+ size 59147544
artifacts/training/model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e1b5f5c330dcc32a5c98f34893464c63d8dc755cec7fe0bef0a449168cf7b2f
+ size 59147544
config/config.yaml ADDED
@@ -0,0 +1,29 @@
+ artifacts_root: artifacts
+ data_ingestion:
+   root_dir: artifacts/data_ingestion
+   source_URL: "https://drive.google.com/file/d/16PZpADG4Pl_SBr2E3DEcvXsLQ5DSUtDP/view?usp=sharing"
+   local_data_file: artifacts/data_ingestion/data.zip
+   unzip_dir: artifacts/data_ingestion
+
+ prepare_base_model:
+   root_dir: artifacts/prepare_base_model
+   base_model_path: artifacts/prepare_base_model/base_model.h5
+   updated_base_model_path: artifacts/prepare_base_model/base_model_updated.h5
+
+ training:
+   root_dir: artifacts/training
+   trained_model_path: artifacts/training/model.h5
+
+ evaluation:
+   path_of_model: artifacts/training/model.h5
+   training_data: artifacts/data_ingestion/kidney-ct-scan-image
+   mlflow_uri: "https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow"
+   all_params:
+     AUGMENTATION: True
+     IMAGE_SIZE: [224, 224, 3]
+     BATCH_SIZE: 16
+     INCLUDE_TOP: False
+     EPOCHS: 5
+     CLASSES: 2
+     WEIGHTS: imagenet
+     LEARNING_RATE: 0.01
dvc.lock ADDED
@@ -0,0 +1,148 @@
+ schema: '2.0'
+ stages:
+   data_ingestion:
+     cmd: python src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+     deps:
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: src/cnnClassifier/components/data_ingestion.py
+       hash: md5
+       md5: f07cc7fed589b1f7d14a637aa94a0433
+       size: 1039
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+       hash: md5
+       md5: e501eeb64cda076b3e15b447e55d6463
+       size: 908
+     outs:
+     - path: artifacts/data_ingestion
+       hash: md5
+       md5: 86510b1e2ff6da777357ccfdc278e4c8.dir
+       size: 116493584
+       nfiles: 466
+   prepare_base_model:
+     cmd: python src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+     deps:
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: params.yaml
+       hash: md5
+       md5: 156a3b540bf80876a34e08b09faaf4fb
+       size: 151
+     - path: src/cnnClassifier/components/prepare_base_model.py
+       hash: md5
+       md5: 3c5230f332299193cb460420f3ce5057
+       size: 2063
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+       hash: md5
+       md5: c886276ae57285dac8969b02bf9077ed
+       size: 954
+     params:
+       params.yaml:
+         CLASSES: 2
+         IMAGE_SIZE:
+         - 224
+         - 224
+         - 3
+         INCLUDE_TOP: false
+         LEARNING_RATE: 0.01
+         WEIGHTS: imagenet
+     outs:
+     - path: artifacts/prepare_base_model
+       hash: md5
+       md5: d761158dc61a51df0233a4d98a02499f.dir
+       size: 118073624
+       nfiles: 2
+   training:
+     cmd: python src/cnnClassifier/pipeline/stage_03_model_trainer.py
+     deps:
+     - path: artifacts/data_ingestion/kidney-ct-scan-image
+       hash: md5
+       md5: 33ed59dbe5dec8ce2bb8e489b55203e4.dir
+       size: 58936381
+       nfiles: 465
+     - path: artifacts/prepare_base_model/base_model_updated.h5
+       hash: md5
+       md5: 12a1e3ebb90d89346ff2beb4fa21053b
+       size: 59147544
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: params.yaml
+       hash: md5
+       md5: 156a3b540bf80876a34e08b09faaf4fb
+       size: 151
+     - path: src/cnnClassifier/components/model_trainer.py
+       hash: md5
+       md5: bc19f92e2812f36ba12a7c66730a6e21
+       size: 2675
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_03_model_trainer.py
+       hash: md5
+       md5: 0951b497a475aac360e1c73b347b2295
+       size: 885
+     params:
+       params.yaml:
+         AUGMENTATION: true
+         BATCH_SIZE: 16
+         EPOCHS: 5
+         IMAGE_SIZE:
+         - 224
+         - 224
+         - 3
+         LEARNING_RATE: 0.01
+     outs:
+     - path: artifacts/training/model.h5
+       hash: md5
+       md5: 87e2a46b9573a6bba1da41192f0dff18
+       size: 59147544
+   evaluation:
+     cmd: python src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+     deps:
+     - path: artifacts/training/model.h5
+       hash: md5
+       md5: 87e2a46b9573a6bba1da41192f0dff18
+       size: 59147544
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: src/cnnClassifier/components/model_evaluation_mlflow.py
+       hash: md5
+       md5: 4612a6a44af8961549348656ece0c848
+       size: 2306
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+       hash: md5
+       md5: dff6dc7d6115804b764c98c53f3bbc43
+       size: 908
+     params:
+       params.yaml:
+         BATCH_SIZE: 16
+         IMAGE_SIZE:
+         - 224
+         - 224
+         - 3
+     outs:
+     - path: scores.json
+       hash: md5
+       md5: 1fad9e68e2b1611a3fc59b064c62106e
+       size: 72
dvc.yaml ADDED
@@ -0,0 +1,61 @@
+ stages:
+   data_ingestion:
+     cmd: python src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+       - src/cnnClassifier/components/data_ingestion.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+     outs:
+       - artifacts/data_ingestion
+
+   prepare_base_model:
+     cmd: python src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+       - src/cnnClassifier/components/prepare_base_model.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+       - params.yaml
+     params:
+       - IMAGE_SIZE
+       - INCLUDE_TOP
+       - CLASSES
+       - WEIGHTS
+       - LEARNING_RATE
+     outs:
+       - artifacts/prepare_base_model
+
+   training:
+     cmd: python src/cnnClassifier/pipeline/stage_03_model_trainer.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_03_model_trainer.py
+       - src/cnnClassifier/components/model_trainer.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+       - params.yaml
+       - artifacts/data_ingestion/kidney-ct-scan-image
+       - artifacts/prepare_base_model/base_model_updated.h5
+     params:
+       - IMAGE_SIZE
+       - EPOCHS
+       - BATCH_SIZE
+       - AUGMENTATION
+       - LEARNING_RATE
+     outs:
+       - artifacts/training/model.h5
+
+   evaluation:
+     cmd: python src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+       - src/cnnClassifier/components/model_evaluation_mlflow.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+       - artifacts/training/model.h5
+     params:
+       - IMAGE_SIZE
+       - BATCH_SIZE
+     metrics:
+       - scores.json:
+           cache: false
main.py ADDED
@@ -0,0 +1,41 @@
+ from cnnClassifier import logger
+ from cnnClassifier.pipeline.stage_01_data_ingestion import DataIngestionTrainingPipeline
+ from cnnClassifier.pipeline.stage_02_prepare_base_model import PrepareBaseModelTrainingPipeline
+ from cnnClassifier.pipeline.stage_03_model_trainer import ModelTrainerTrainingPipeline
+ from cnnClassifier.pipeline.stage_04_model_evaluation import ModelEvaluationPipeline
+
+ STAGE_NAME = "Data Ingestion stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     DataIngestionTrainingPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
+
+ STAGE_NAME = "Prepare Base Model stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     PrepareBaseModelTrainingPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
+
+ STAGE_NAME = "Training stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     ModelTrainerTrainingPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
+
+ STAGE_NAME = "Model Evaluation stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     ModelEvaluationPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
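The four near-identical try/except blocks in main.py could also be expressed as a single loop over stages. A hedged sketch of that alternative, with dummy classes standing in for the real pipeline classes (the stage names match main.py, but `DummyStage` is invented for the demo):

```python
import logging

logger = logging.getLogger("cnnClassifierLogger")

class DummyStage:
    """Stand-in for a pipeline stage class exposing a .main() method."""
    def main(self):
        pass

STAGES = [
    ("Data Ingestion stage", DummyStage),
    ("Prepare Base Model stage", DummyStage),
    ("Training stage", DummyStage),
    ("Model Evaluation stage", DummyStage),
]

completed = []
for name, stage_cls in STAGES:
    try:
        logger.info(f">>>>>> stage {name} started <<<<<<")
        stage_cls().main()          # run the stage
        logger.info(f">>>>>> stage {name} completed <<<<<<\n\nx==========x")
        completed.append(name)
    except Exception:
        logger.exception(f"stage {name} failed")
        raise
```

The repeated-block form in main.py is arguably easier to step through; the loop form avoids duplicating the logging and error handling.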
params.yaml ADDED
@@ -0,0 +1,11 @@
+ AUGMENTATION: True
+ IMAGE_SIZE:
+   - 224
+   - 224
+   - 3
+ BATCH_SIZE: 16
+ INCLUDE_TOP: False
+ EPOCHS: 5
+ CLASSES: 2
+ WEIGHTS: imagenet
+ LEARNING_RATE: 0.01
requirements.txt ADDED
@@ -0,0 +1,22 @@
+ tensorflow
+ keras
+ dvc
+ numpy
+ pandas
+ scikit-learn
+ matplotlib
+ seaborn
+ jupyterlab
+ scipy
+ mlflow
+ notebook
+ python-box
+ pyYAML
+ tqdm
+ joblib
+ types-PyYAML
+ Flask
+ Flask-Cors
+ gdown
+ ensure
+ python-dotenv
setup.py ADDED
@@ -0,0 +1,25 @@
+ import setuptools
+
+ with open("README.md", "r", encoding="utf-8") as f:
+     long_description = f.read()
+
+ __version__ = "0.0.0"
+
+ REPO_NAME = "Kidney_classification_Using_MLOPS_and_DVC_Data-version-control"
+
+ AUTHOR_USER_NAME = "sentongo-web"
+ SRC_REPO = "cnnClassifier"
+ AUTHOR_EMAIL = "sentongogray1992@gmail.com"
+
+ setuptools.setup(
+     name=SRC_REPO,
+     version=__version__,
+     author=AUTHOR_USER_NAME,
+     author_email=AUTHOR_EMAIL,
+     description="A machine learning project for kidney classification using MLOps and DVC.",
+     long_description=long_description,
+     long_description_content_type="text/markdown",
+     url=f"https://github.com/{AUTHOR_USER_NAME}/{REPO_NAME}",
+     package_dir={"": "src"},
+     packages=setuptools.find_packages(where="src")
+ )
src/cnnClassifier/__init__.py ADDED
@@ -0,0 +1,20 @@
+ import os
+ import sys
+ import logging
+
+ logging_str = "[%(asctime)s: %(levelname)s: %(module)s]: %(message)s"
+
+ log_dir = "logs"
+ log_filepath = os.path.join(log_dir, "running_logs.log")
+ os.makedirs(log_dir, exist_ok=True)
+
+ logging.basicConfig(
+     level=logging.INFO,
+     format=logging_str,
+     handlers=[
+         logging.FileHandler(log_filepath),
+         logging.StreamHandler(sys.stdout)
+     ]
+ )
+
+ logger = logging.getLogger("cnnClassifierLogger")
src/cnnClassifier/components/__init__.py ADDED
File without changes
src/cnnClassifier/components/data_ingestion.py ADDED
@@ -0,0 +1,26 @@
+ import os
+ import zipfile
+ import gdown
+ from pathlib import Path
+ from cnnClassifier import logger
+ from cnnClassifier.utils.common import get_size
+ from cnnClassifier.entity.config_entity import DataIngestionConfig
+ 
+ 
+ class DataIngestion:
+     def __init__(self, config: DataIngestionConfig):
+         self.config = config
+ 
+     def download_file(self):
+         if not os.path.exists(self.config.local_data_file):
+             gdown.download(self.config.source_URL, str(self.config.local_data_file), quiet=False, fuzzy=True)
+             logger.info(f"Downloaded data to {self.config.local_data_file}")
+         else:
+             logger.info(f"File already exists of size: {get_size(Path(self.config.local_data_file))}")
+ 
+     def extract_zip_file(self):
+         """Extracts the zip file into the unzip directory."""
+         unzip_path = self.config.unzip_dir
+         os.makedirs(unzip_path, exist_ok=True)
+         with zipfile.ZipFile(self.config.local_data_file, 'r') as zip_ref:
+             zip_ref.extractall(unzip_path)
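A stdlib-only sketch of the `extract_zip_file` step: write a tiny archive, then extract it the same way the component does. The archive contents and paths here are hypothetical placeholders, not the real dataset.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for config.local_data_file downloaded by gdown.
    local_data_file = os.path.join(tmp, "data.zip")
    with zipfile.ZipFile(local_data_file, "w") as zf:
        zf.writestr("kidney-ct-scan-image/scan_001.txt", "fake scan")

    # Mirrors extract_zip_file(): ensure the unzip dir exists, then extract.
    unzip_dir = os.path.join(tmp, "extracted")
    os.makedirs(unzip_dir, exist_ok=True)
    with zipfile.ZipFile(local_data_file, "r") as zip_ref:
        zip_ref.extractall(unzip_dir)

    extracted = os.path.join(unzip_dir, "kidney-ct-scan-image", "scan_001.txt")
    extracted_ok = os.path.exists(extracted)

assert extracted_ok
```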
src/cnnClassifier/components/model_evaluation_mlflow.py ADDED
@@ -0,0 +1,63 @@
+ import os
+ import tensorflow as tf
+ from pathlib import Path
+ import dagshub
+ import mlflow
+ import mlflow.tensorflow
+ from urllib.parse import urlparse
+ from cnnClassifier.entity.config_entity import EvaluationConfig
+ from cnnClassifier.utils.common import save_json
+ 
+ 
+ class Evaluation:
+     def __init__(self, config: EvaluationConfig):
+         self.config = config
+ 
+     def _valid_generator(self):
+         datagenerator_kwargs = dict(rescale=1.0 / 255, validation_split=0.30)
+         dataflow_kwargs = dict(
+             target_size=self.config.params_image_size[:-1],
+             batch_size=self.config.params_batch_size,
+             interpolation="bilinear"
+         )
+         valid_datagenerator = tf.keras.preprocessing.image.ImageDataGenerator(
+             **datagenerator_kwargs
+         )
+         self.valid_generator = valid_datagenerator.flow_from_directory(
+             directory=self.config.training_data,
+             subset="validation",
+             shuffle=False,
+             **dataflow_kwargs
+         )
+ 
+     @staticmethod
+     def load_model(path: Path) -> tf.keras.Model:
+         return tf.keras.models.load_model(path)
+ 
+     def evaluation(self):
+         self.model = self.load_model(self.config.path_of_model)
+         self._valid_generator()
+         self.score = self.model.evaluate(self.valid_generator)
+         self.save_score()
+ 
+     def save_score(self):
+         scores = {"loss": self.score[0], "accuracy": self.score[1]}
+         save_json(path=Path("scores.json"), data=scores)
+ 
+     def log_into_mlflow(self):
+         dagshub.init(
+             repo_owner="sentongo-web",
+             repo_name="Kidney_classification_Using_MLOPS_and_DVC_Data-version-control",
+             mlflow=True
+         )
+         mlflow.set_registry_uri(self.config.mlflow_uri)
+         tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme
+ 
+         with mlflow.start_run():
+             mlflow.log_params(self.config.all_params)
+             mlflow.log_metrics({"loss": self.score[0], "accuracy": self.score[1]})
+ 
+             if tracking_url_type_store != "file":
+                 mlflow.tensorflow.log_model(self.model, "model", registered_model_name="VGG16Model")
+             else:
+                 mlflow.tensorflow.log_model(self.model, "model")
src/cnnClassifier/components/model_trainer.py ADDED
@@ -0,0 +1,77 @@
+ import os
+ import tensorflow as tf
+ from pathlib import Path
+ from cnnClassifier.entity.config_entity import TrainingConfig
+ 
+ 
+ class Training:
+     def __init__(self, config: TrainingConfig):
+         self.config = config
+ 
+     def get_base_model(self):
+         self.model = tf.keras.models.load_model(
+             self.config.updated_base_model_path, compile=False
+         )
+         self.model.compile(
+             optimizer=tf.keras.optimizers.SGD(learning_rate=self.config.params_learning_rate),
+             loss=tf.keras.losses.CategoricalCrossentropy(),
+             metrics=["accuracy"]
+         )
+ 
+     def train_valid_generator(self):
+         datagenerator_kwargs = dict(rescale=1.0 / 255, validation_split=0.20)
+ 
+         dataflow_kwargs = dict(
+             target_size=self.config.params_image_size[:-1],
+             batch_size=self.config.params_batch_size,
+             interpolation="bilinear"
+         )
+ 
+         valid_datagenerator = tf.keras.preprocessing.image.ImageDataGenerator(
+             **datagenerator_kwargs
+         )
+ 
+         self.valid_generator = valid_datagenerator.flow_from_directory(
+             directory=self.config.training_data,
+             subset="validation",
+             shuffle=False,
+             **dataflow_kwargs
+         )
+ 
+         if self.config.params_is_augmentation:
+             train_datagenerator = tf.keras.preprocessing.image.ImageDataGenerator(
+                 rotation_range=40,
+                 horizontal_flip=True,
+                 width_shift_range=0.2,
+                 height_shift_range=0.2,
+                 shear_range=0.2,
+                 zoom_range=0.2,
+                 **datagenerator_kwargs
+             )
+         else:
+             train_datagenerator = valid_datagenerator
+ 
+         self.train_generator = train_datagenerator.flow_from_directory(
+             directory=self.config.training_data,
+             subset="training",
+             shuffle=True,
+             **dataflow_kwargs
+         )
+ 
+     @staticmethod
+     def save_model(path: Path, model: tf.keras.Model):
+         model.save(path)
+ 
+     def train(self):
+         self.steps_per_epoch = self.train_generator.samples // self.train_generator.batch_size
+         self.validation_steps = self.valid_generator.samples // self.valid_generator.batch_size
+ 
+         self.model.fit(
+             self.train_generator,
+             epochs=self.config.params_epochs,
+             steps_per_epoch=self.steps_per_epoch,
+             validation_steps=self.validation_steps,
+             validation_data=self.valid_generator
+         )
+ 
+         self.save_model(path=self.config.trained_model_path, model=self.model)
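A quick sketch of the floor division in `Training.train()`: any final partial batch is dropped from `steps_per_epoch`, so `model.fit()` consumes only whole batches per epoch. The sample counts below are hypothetical.

```python
# Hypothetical training-subset size with the BATCH_SIZE from params.yaml.
samples, batch_size = 465, 16

# Mirrors train(): integer floor division drops the trailing partial batch.
steps_per_epoch = samples // batch_size
assert steps_per_epoch == 29                          # 29 * 16 = 464 images per epoch
assert samples - steps_per_epoch * batch_size == 1    # 1 image never seen that epoch
```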
src/cnnClassifier/components/prepare_base_model.py ADDED
@@ -0,0 +1,59 @@
+ import os
+ import urllib.request as request
+ from pathlib import Path
+ import tensorflow as tf
+ from cnnClassifier.entity.config_entity import PrepareBaseModelConfig
+ 
+ 
+ class PrepareBaseModel:
+     def __init__(self, config: PrepareBaseModelConfig):
+         self.config = config
+ 
+     def get_base_model(self):
+         self.model = tf.keras.applications.VGG16(
+             input_shape=self.config.params_image_size,
+             weights=self.config.params_weights,
+             include_top=self.config.params_include_top
+         )
+         self.save_model(path=self.config.base_model_path, model=self.model)
+ 
+     @staticmethod
+     def _prepare_full_model(model, classes, freeze_all, freeze_till, learning_rate):
+         if freeze_all:
+             for layer in model.layers:
+                 layer.trainable = False
+         elif freeze_till is not None and freeze_till > 0:
+             for layer in model.layers[:-freeze_till]:
+                 layer.trainable = False
+ 
+         flatten_in = tf.keras.layers.Flatten()(model.output)
+         prediction = tf.keras.layers.Dense(
+             units=classes,
+             activation="softmax"
+         )(flatten_in)
+ 
+         full_model = tf.keras.models.Model(
+             inputs=model.input,
+             outputs=prediction
+         )
+         full_model.compile(
+             optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
+             loss=tf.keras.losses.CategoricalCrossentropy(),
+             metrics=["accuracy"]
+         )
+         full_model.summary()
+         return full_model
+ 
+     def update_base_model(self):
+         self.full_model = self._prepare_full_model(
+             model=self.model,
+             classes=self.config.params_classes,
+             freeze_all=True,
+             freeze_till=None,
+             learning_rate=self.config.params_learning_rate
+         )
+         self.save_model(path=self.config.updated_base_model_path, model=self.full_model)
+ 
+     @staticmethod
+     def save_model(path: Path, model: tf.keras.Model):
+         model.save(path)
src/cnnClassifier/config/__init__.py ADDED
File without changes
src/cnnClassifier/config/configuration.py ADDED
@@ -0,0 +1,67 @@
+ from cnnClassifier.constants import CONFIG_FILE_PATH, PARAMS_FILE_PATH
+ from cnnClassifier.utils.common import read_yaml, create_directories
+ from cnnClassifier.entity.config_entity import DataIngestionConfig, PrepareBaseModelConfig, TrainingConfig, EvaluationConfig
+ from pathlib import Path
+ 
+ 
+ class ConfigurationManager:
+     def __init__(
+         self,
+         config_filepath=CONFIG_FILE_PATH,
+         params_filepath=PARAMS_FILE_PATH
+     ):
+         self.config = read_yaml(config_filepath)
+         self.params = read_yaml(params_filepath)
+         create_directories([self.config.artifacts_root])
+ 
+     def get_data_ingestion_config(self) -> DataIngestionConfig:
+         config = self.config.data_ingestion
+         create_directories([config.root_dir])
+         return DataIngestionConfig(
+             root_dir=config.root_dir,
+             source_URL=config.source_URL,
+             local_data_file=config.local_data_file,
+             unzip_dir=config.unzip_dir
+         )
+ 
+     def get_prepare_base_model_config(self) -> PrepareBaseModelConfig:
+         config = self.config.prepare_base_model
+         create_directories([config.root_dir])
+         return PrepareBaseModelConfig(
+             root_dir=config.root_dir,
+             base_model_path=config.base_model_path,
+             updated_base_model_path=config.updated_base_model_path,
+             params_image_size=self.params.IMAGE_SIZE,
+             params_learning_rate=self.params.LEARNING_RATE,
+             params_include_top=self.params.INCLUDE_TOP,
+             params_weights=self.params.WEIGHTS,
+             params_classes=self.params.CLASSES
+         )
+ 
+     def get_training_config(self) -> TrainingConfig:
+         training = self.config.training
+         prepare_base_model = self.config.prepare_base_model
+         training_data = Path(self.config.data_ingestion.unzip_dir) / "kidney-ct-scan-image"
+         create_directories([training.root_dir])
+         return TrainingConfig(
+             root_dir=training.root_dir,
+             trained_model_path=training.trained_model_path,
+             updated_base_model_path=prepare_base_model.updated_base_model_path,
+             training_data=training_data,
+             params_epochs=self.params.EPOCHS,
+             params_batch_size=self.params.BATCH_SIZE,
+             params_is_augmentation=self.params.AUGMENTATION,
+             params_image_size=self.params.IMAGE_SIZE,
+             params_learning_rate=self.params.LEARNING_RATE,
+         )
+ 
+     def get_evaluation_config(self) -> EvaluationConfig:
+         config = self.config.evaluation
+         return EvaluationConfig(
+             path_of_model=config.path_of_model,
+             training_data=config.training_data,
+             all_params=dict(config.all_params),
+             mlflow_uri=config.mlflow_uri,
+             params_image_size=self.params.IMAGE_SIZE,
+             params_batch_size=self.params.BATCH_SIZE,
+         )
src/cnnClassifier/constants/__init__.py ADDED
@@ -0,0 +1,4 @@
+ from pathlib import Path
+ 
+ CONFIG_FILE_PATH: Path = Path("config/config.yaml")
+ PARAMS_FILE_PATH: Path = Path("params.yaml")
src/cnnClassifier/entity/__init__.py ADDED
File without changes
src/cnnClassifier/entity/config_entity.py ADDED
@@ -0,0 +1,45 @@
+ from dataclasses import dataclass
+ from pathlib import Path
+ 
+ 
+ @dataclass(frozen=True)
+ class DataIngestionConfig:
+     root_dir: Path
+     source_URL: str
+     local_data_file: Path
+     unzip_dir: Path
+ 
+ 
+ @dataclass(frozen=True)
+ class PrepareBaseModelConfig:
+     root_dir: Path
+     base_model_path: Path
+     updated_base_model_path: Path
+     params_image_size: list
+     params_learning_rate: float
+     params_include_top: bool
+     params_weights: str
+     params_classes: int
+ 
+ 
+ @dataclass(frozen=True)
+ class TrainingConfig:
+     root_dir: Path
+     trained_model_path: Path
+     updated_base_model_path: Path
+     training_data: Path
+     params_epochs: int
+     params_batch_size: int
+     params_is_augmentation: bool
+     params_image_size: list
+     params_learning_rate: float
+ 
+ 
+ @dataclass(frozen=True)
+ class EvaluationConfig:
+     path_of_model: Path
+     training_data: Path
+     all_params: dict
+     mlflow_uri: str
+     params_image_size: list
+     params_batch_size: int
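A minimal sketch of why these entities use `@dataclass(frozen=True)`: once `ConfigurationManager` builds a config, no stage can mutate it mid-pipeline. The field values below are hypothetical placeholders.

```python
from dataclasses import dataclass, FrozenInstanceError
from pathlib import Path

@dataclass(frozen=True)
class DataIngestionConfig:
    root_dir: Path
    source_URL: str
    local_data_file: Path
    unzip_dir: Path

cfg = DataIngestionConfig(
    root_dir=Path("artifacts/data_ingestion"),
    source_URL="https://example.com/data.zip",  # hypothetical placeholder URL
    local_data_file=Path("artifacts/data_ingestion/data.zip"),
    unzip_dir=Path("artifacts/data_ingestion"),
)

mutated = True
try:
    cfg.root_dir = Path("elsewhere")  # frozen=True makes this raise
except FrozenInstanceError:
    mutated = False
assert mutated is False
```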
src/cnnClassifier/pipeline/__init__.py ADDED
File without changes
src/cnnClassifier/pipeline/prediction.py ADDED
@@ -0,0 +1,25 @@
+ import numpy as np
+ from tensorflow.keras.models import load_model
+ from tensorflow.keras.preprocessing import image
+ import os
+ 
+ 
+ class PredictionPipeline:
+     def __init__(self, filename):
+         self.filename = filename
+ 
+     def predict(self):
+         model = load_model(os.path.join("artifacts", "training", "model.h5"))
+ 
+         img = image.load_img(self.filename, target_size=(224, 224))
+         img_array = image.img_to_array(img)
+         img_array = np.expand_dims(img_array, axis=0) / 255.0
+ 
+         result = np.argmax(model.predict(img_array), axis=1)
+ 
+         if result[0] == 1:
+             prediction = "Tumor"
+         else:
+             prediction = "Normal"
+ 
+         return [{"image": prediction}]
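A NumPy-only sketch of the preprocessing and label mapping in `predict()`, with the model call replaced by a hypothetical softmax output so it runs without TensorFlow or a trained model:

```python
import numpy as np

# Hypothetical stand-in for image.img_to_array() output: a (224, 224, 3) array.
img_array = np.full((224, 224, 3), 255.0, dtype=np.float32)

# Same preprocessing as PredictionPipeline.predict(): add a batch dim, rescale to [0, 1].
batch = np.expand_dims(img_array, axis=0) / 255.0
assert batch.shape == (1, 224, 224, 3)
assert float(batch.max()) == 1.0

# argmax over the 2-class softmax maps index 1 -> "Tumor", 0 -> "Normal".
probs = np.array([[0.2, 0.8]])  # hypothetical model.predict() output
result = np.argmax(probs, axis=1)
prediction = "Tumor" if result[0] == 1 else "Normal"
assert prediction == "Tumor"
```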
src/cnnClassifier/pipeline/stage_01_data_ingestion.py ADDED
@@ -0,0 +1,28 @@
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.data_ingestion import DataIngestion
+ 
+ STAGE_NAME = "Data Ingestion stage"
+ 
+ 
+ class DataIngestionTrainingPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         data_ingestion_config = config.get_data_ingestion_config()
+         data_ingestion = DataIngestion(config=data_ingestion_config)
+         data_ingestion.download_file()
+         data_ingestion.extract_zip_file()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = DataIngestionTrainingPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/pipeline/stage_02_prepare_base_model.py ADDED
@@ -0,0 +1,28 @@
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.prepare_base_model import PrepareBaseModel
+ 
+ STAGE_NAME = "Prepare Base Model stage"
+ 
+ 
+ class PrepareBaseModelTrainingPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         prepare_base_model_config = config.get_prepare_base_model_config()
+         prepare_base_model = PrepareBaseModel(config=prepare_base_model_config)
+         prepare_base_model.get_base_model()
+         prepare_base_model.update_base_model()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = PrepareBaseModelTrainingPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/pipeline/stage_03_model_trainer.py ADDED
@@ -0,0 +1,29 @@
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.model_trainer import Training
+ 
+ STAGE_NAME = "Training stage"
+ 
+ 
+ class ModelTrainerTrainingPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         training_config = config.get_training_config()
+         training = Training(config=training_config)
+         training.get_base_model()
+         training.train_valid_generator()
+         training.train()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = ModelTrainerTrainingPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/pipeline/stage_04_model_evaluation.py ADDED
@@ -0,0 +1,30 @@
+ from dotenv import load_dotenv
+ load_dotenv()
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.model_evaluation_mlflow import Evaluation
+ 
+ STAGE_NAME = "Model Evaluation stage"
+ 
+ 
+ class ModelEvaluationPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         eval_config = config.get_evaluation_config()
+         evaluation = Evaluation(config=eval_config)
+         evaluation.evaluation()
+         evaluation.log_into_mlflow()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = ModelEvaluationPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/utils/__init__.py ADDED
File without changes
src/cnnClassifier/utils/common.py ADDED
@@ -0,0 +1,148 @@
+ import os
+ import json
+ import base64
+ import joblib  # type: ignore[import-untyped]
+ import yaml
+ from pathlib import Path
+ from typing import Any, cast
+ from box import ConfigBox  # type: ignore[import-untyped]
+ from box.exceptions import BoxValueError  # type: ignore[import-untyped]
+ from ensure import ensure_annotations  # type: ignore[import-untyped]
+ from cnnClassifier import logger
+ 
+ 
+ @ensure_annotations
+ def read_yaml(path_to_yaml: Path) -> ConfigBox:
+     """Reads a YAML file and returns its content as a ConfigBox.
+ 
+     Args:
+         path_to_yaml (Path): Path to the YAML file.
+ 
+     Raises:
+         ValueError: If the YAML file is empty.
+         BoxValueError: If the YAML content is invalid.
+ 
+     Returns:
+         ConfigBox: Parsed YAML content with dot-access support.
+     """
+     try:
+         with open(path_to_yaml) as yaml_file:
+             content = yaml.safe_load(yaml_file)
+         if content is None:
+             raise ValueError(f"YAML file is empty: {path_to_yaml}")
+         logger.info(f"YAML file loaded successfully: {path_to_yaml}")
+         return ConfigBox(content)
+     except BoxValueError as e:
+         raise BoxValueError(f"Invalid YAML content in {path_to_yaml}: {e}")
+ 
+ 
+ def create_directories(path_to_directories: list[Path], verbose: bool = True) -> None:
+     """Creates a list of directories if they do not already exist.
+ 
+     Args:
+         path_to_directories (list[Path]): List of directory paths to create.
+         verbose (bool): Whether to log each created directory. Defaults to True.
+     """
+     for path in path_to_directories:
+         os.makedirs(str(path), exist_ok=True)
+         if verbose:
+             logger.info(f"Created directory: {path}")
+ 
+ 
+ def save_json(path: Path, data: dict[str, Any]) -> None:
+     """Saves a dictionary as a JSON file.
+ 
+     Args:
+         path (Path): Path where the JSON file will be saved.
+         data (dict[str, Any]): Dictionary to save.
+     """
+     with open(path, "w") as f:
+         json.dump(data, f, indent=4)
+     logger.info(f"JSON saved to: {path}")
+ 
+ 
+ @ensure_annotations
+ def load_json(path: Path) -> ConfigBox:
+     """Loads a JSON file and returns its content as a ConfigBox.
+ 
+     Args:
+         path (Path): Path to the JSON file.
+ 
+     Returns:
+         ConfigBox: JSON content with dot-access support.
+     """
+     with open(path) as f:
+         content = json.load(f)
+     logger.info(f"JSON loaded from: {path}")
+     return ConfigBox(content)
+ 
+ 
+ @ensure_annotations
+ def save_bin(data: Any, path: Path) -> None:
+     """Saves any Python object as a binary file using joblib.
+ 
+     Args:
+         data (Any): Object to serialize (e.g. model, scaler).
+         path (Path): Destination path for the binary file.
+     """
+     joblib.dump(value=data, filename=path)  # type: ignore[no-untyped-call]
+     logger.info(f"Binary file saved to: {path}")
+ 
+ 
+ @ensure_annotations
+ def load_bin(path: Path) -> Any:
+     """Loads a binary file saved with joblib.
+ 
+     Args:
+         path (Path): Path to the binary file.
+ 
+     Returns:
+         Any: The deserialized Python object.
+     """
+     data: Any = cast(Any, joblib.load(path))  # type: ignore[no-untyped-call]
+     logger.info(f"Binary file loaded from: {path}")
+     return data
+ 
+ 
+ @ensure_annotations
+ def get_size(path: Path) -> str:
+     """Returns the size of a file in kilobytes (KB).
+ 
+     Args:
+         path (Path): Path to the file.
+ 
+     Returns:
+         str: File size as a human-readable string, e.g. "~ 24 KB".
+     """
+     size_in_kb = round(os.path.getsize(path) / 1024)
+     return f"~ {size_in_kb} KB"
+ 
+ 
+ def decode_image(imgstring: str, file_name: str) -> None:
+     """Decodes a base64-encoded image string and writes it to a file.
+     Used by the Flask prediction endpoint to receive images via API.
+ 
+     Args:
+         imgstring (str): Base64-encoded image string.
+         file_name (str): Destination file path to write the decoded image.
+     """
+     imgdata = base64.b64decode(imgstring)
+     with open(file_name, "wb") as f:
+         f.write(imgdata)
+     logger.info(f"Image decoded and saved to: {file_name}")
+ 
+ 
+ def encode_image_into_base64(image_path: str) -> str:
+     """Reads an image file and encodes it into a base64 string.
+     Used to return prediction results as base64 over the API.
+ 
+     Args:
+         image_path (str): Path to the image file.
+ 
+     Returns:
+         str: Base64-encoded string of the image.
+     """
+     with open(image_path, "rb") as f:
+         encoded = base64.b64encode(f.read()).decode("utf-8")
+     logger.info(f"Image encoded to base64 from: {image_path}")
+     return encoded
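A stdlib round trip mirroring `decode_image` / `encode_image_into_base64`: the Flask API ships images as base64 strings in both directions, and decode must invert encode exactly. The payload bytes are hypothetical.

```python
import base64

# Hypothetical image bytes (a real call would read a PNG/JPEG file).
payload = b"\x89PNG\r\n fake image bytes"

encoded = base64.b64encode(payload).decode("utf-8")  # encode_image_into_base64 path
decoded = base64.b64decode(encoded)                  # decode_image path
assert decoded == payload
```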
template.py ADDED
@@ -0,0 +1,38 @@
+ import os
+ from pathlib import Path
+ import logging
+ 
+ # logging string
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
+ 
+ project_name = "cnnClassifier"
+ list_of_files = [
+     ".github/workflows/.gitkeep",
+     f"src/{project_name}/__init__.py",
+     f"src/{project_name}/components/__init__.py",
+     f"src/{project_name}/utils/__init__.py",
+     f"src/{project_name}/config/__init__.py",
+     f"src/{project_name}/config/configuration.py",
+     f"src/{project_name}/pipeline/__init__.py",
+     f"src/{project_name}/entity/__init__.py",
+     f"src/{project_name}/constants/__init__.py",
+     "config/config.yaml",
+     "dvc.yaml",
+     "params.yaml",
+     "requirements.txt",
+     "setup.py",
+     "research/trials.ipynb",
+     "templates/index.html"
+ ]
+ 
+ for filepath in list_of_files:
+     filepath = Path(filepath)
+     filedir, filename = os.path.split(filepath)
+     if filedir != "":
+         os.makedirs(filedir, exist_ok=True)
+         logging.info(f"Creating directory: {filedir} for file: {filename}")
+     if not os.path.exists(filepath) or os.path.getsize(filepath) == 0:
+         with open(filepath, "w") as f:
+             pass
+         logging.info(f"Creating empty file: {filepath}")
+     else:
+         logging.info(f"File already exists and is not empty: {filepath}")
templates/index.html ADDED
@@ -0,0 +1,728 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <!DOCTYPE html>
2
+ <html lang="en" data-theme="light">
3
+ <head>
4
+ <meta charset="UTF-8" />
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
+ <title>KidneyDL CT Scan Classifier</title>
7
+ <link rel="preconnect" href="https://fonts.googleapis.com" />
8
+ <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&display=swap" rel="stylesheet" />
9
+ <style>
10
+ /* ── Theme tokens ─────────────────────────────────────────── */
11
+ :root {
12
+ --bg: #f0f5ff;
13
+ --surface: #ffffff;
14
+ --surface-alt: #f8fafc;
15
+ --border: #e2e8f0;
16
+ --text: #0f172a;
17
+ --text-muted: #64748b;
18
+ --accent: #3b82f6;
19
+ --accent-dark: #2563eb;
20
+ --accent-glow: rgba(59,130,246,0.15);
21
+ --success: #10b981;
22
+ --success-bg: #ecfdf5;
23
+ --success-bdr: #6ee7b7;
24
+ --danger: #ef4444;
25
+ --danger-bg: #fef2f2;
26
+ --danger-bdr: #fca5a5;
27
+ --shadow: 0 4px 32px rgba(15,23,42,0.08);
28
+ --shadow-lg: 0 8px 48px rgba(15,23,42,0.14);
29
+ --radius: 18px;
30
+ --radius-sm: 12px;
31
+ --ease: 0.25s ease;
32
+ }
33
+ [data-theme="dark"] {
34
+ --bg: #080f1e;
35
+ --surface: #111827;
36
+ --surface-alt: #1a2338;
37
+ --border: #1e2d45;
38
+ --text: #e2e8f0;
39
+ --text-muted: #94a3b8;
40
+ --accent: #60a5fa;
41
+ --accent-dark: #3b82f6;
42
+ --accent-glow: rgba(96,165,250,0.14);
43
+ --success: #34d399;
44
+ --success-bg: #022c22;
45
+ --success-bdr: #065f46;
46
+ --danger: #f87171;
47
+ --danger-bg: #2d0a0a;
48
+ --danger-bdr: #7f1d1d;
49
+ --shadow: 0 4px 32px rgba(0,0,0,0.45);
50
+ --shadow-lg: 0 8px 48px rgba(0,0,0,0.6);
51
+ }
52
+
53
+ /* ── Reset ────────────────────────────────────────────────── */
54
+ *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
55
+ html { scroll-behavior: smooth; }
56
+ body {
57
+ font-family: 'Inter', system-ui, sans-serif;
58
+ background: var(--bg);
59
+ color: var(--text);
60
+ min-height: 100vh;
61
+ transition: background var(--ease), color var(--ease);
62
+ line-height: 1.65;
63
+ }
64
+ a { color: var(--accent); text-decoration: none; transition: opacity 0.2s; }
65
+ a:hover { opacity: 0.75; }
66
+
67
+ /* ── Top bar ──────────────────────────────────────────────── */
68
+ .topbar {
69
+ position: sticky; top: 0; z-index: 100;
70
+ display: flex; align-items: center; justify-content: space-between;
71
+ padding: 14px 32px;
72
+ background: var(--surface);
73
+ border-bottom: 1px solid var(--border);
74
+ box-shadow: var(--shadow);
75
+ }
76
+ .topbar-brand {
77
+ display: flex; align-items: center; gap: 10px;
78
+ font-size: 1.05rem; font-weight: 800; letter-spacing: -0.5px;
79
+ color: var(--text);
80
+ }
81
+ .pulse {
82
+ width: 9px; height: 9px; border-radius: 50%;
83
+ background: var(--accent);
84
+ animation: pulseRing 2.2s ease infinite;
85
+ }
86
+ @keyframes pulseRing {
87
+ 0%, 100% { box-shadow: 0 0 0 0 var(--accent-glow); }
88
+ 50% { box-shadow: 0 0 0 8px rgba(0,0,0,0); }
89
+ }
90
+ .theme-btn {
91
+ display: flex; align-items: center; gap: 7px;
92
+ background: var(--surface-alt);
93
+ border: 1px solid var(--border);
94
+ border-radius: 999px;
95
+ padding: 6px 16px;
96
+ cursor: pointer;
97
+ font-family: inherit;
98
+ font-size: 0.78rem; font-weight: 600;
99
+ color: var(--text-muted);
100
+ transition: all var(--ease);
101
+ }
102
+ .theme-btn:hover { border-color: var(--accent); color: var(--text); }
103
+ .theme-btn svg { width: 14px; height: 14px; }
104
+
105
+ /* ── Page ─────────────────────────────────────────────────── */
106
+ .page { max-width: 960px; margin: 0 auto; padding: 52px 24px 88px; }
107
+
108
+ /* ── Hero ─────────────────────────────────────────────────── */
109
+ .hero { text-align: center; margin-bottom: 60px; }
110
+ .hero-badge {
111
+ display: inline-flex; align-items: center; gap: 7px;
112
+ background: var(--accent-glow);
+ border: 1px solid color-mix(in srgb, var(--accent) 50%, transparent);
+ color: var(--accent);
+ font-size: 0.7rem; font-weight: 700; letter-spacing: 0.1em;
+ text-transform: uppercase;
+ padding: 5px 16px; border-radius: 999px; margin-bottom: 22px;
+ }
+ .hero-badge .dot { width: 6px; height: 6px; border-radius: 50%; background: currentColor; }
+ .hero h1 {
+ font-size: clamp(2rem, 5.5vw, 3.2rem);
+ font-weight: 800; letter-spacing: -1.5px; line-height: 1.12;
+ margin-bottom: 18px;
+ background: linear-gradient(135deg, var(--text) 30%, var(--accent) 100%);
+ -webkit-background-clip: text; -webkit-text-fill-color: transparent;
+ background-clip: text;
+ }
+ .hero p {
+ font-size: 1.05rem; color: var(--text-muted);
+ max-width: 580px; margin: 0 auto; line-height: 1.75;
+ }
+
+ /* ── Cards ────────────────────────────────────────────────── */
+ .card {
+ background: var(--surface);
+ border: 1px solid var(--border);
+ border-radius: var(--radius);
+ box-shadow: var(--shadow);
+ padding: 36px;
+ transition: background var(--ease), border-color var(--ease);
+ }
+
+ /* ── Classifier layout ────────────────────────────────────── */
+ .classifier-grid {
+ display: grid; grid-template-columns: 1fr 1fr; gap: 24px;
+ }
+ @media (max-width: 620px) { .classifier-grid { grid-template-columns: 1fr; } }
+
+ .section-eyebrow {
+ font-size: 0.7rem; font-weight: 700; letter-spacing: 0.1em;
+ text-transform: uppercase; color: var(--text-muted); margin-bottom: 14px;
+ }
+
+ /* Drop zone */
+ .drop-zone {
+ border: 2px dashed var(--border);
+ border-radius: var(--radius-sm);
+ padding: 38px 20px; text-align: center; cursor: pointer;
+ background: var(--surface-alt);
+ transition: border-color var(--ease), background var(--ease), transform 0.15s;
+ user-select: none;
+ }
+ .drop-zone:hover, .drop-zone.over {
+ border-color: var(--accent); background: var(--accent-glow);
+ transform: translateY(-2px);
+ }
+ .drop-zone input { display: none; }
+ .dz-icon { font-size: 2.4rem; margin-bottom: 12px; }
+ .dz-hint { font-size: 0.86rem; color: var(--text-muted); line-height: 1.6; }
+ .dz-hint b { color: var(--accent); font-weight: 600; }
+
+ /* Preview */
+ .preview-box {
+ border-radius: var(--radius-sm);
+ overflow: hidden;
+ border: 1px solid var(--border);
+ background: var(--surface-alt);
+ min-height: 200px;
+ display: flex; align-items: center; justify-content: center;
+ position: relative;
+ }
+ .preview-box img {
+ width: 100%; height: 200px; object-fit: cover; display: none;
+ }
+ .preview-box img.show { display: block; }
+ .preview-empty {
+ display: flex; flex-direction: column; align-items: center;
+ gap: 10px; color: var(--text-muted); font-size: 0.82rem;
+ }
+ .preview-empty svg { width: 38px; height: 38px; opacity: 0.25; }
+ .preview-label {
+ position: absolute; bottom: 0; left: 0; right: 0;
+ padding: 6px 12px;
+ background: rgba(0,0,0,0.55);
+ color: #fff; font-size: 0.72rem;
+ white-space: nowrap; overflow: hidden; text-overflow: ellipsis;
+ display: none;
+ }
199
+
+ /* Buttons */
+ .btn-row { display: flex; gap: 12px; margin-top: 24px; }
+ .btn {
+ flex: 1; padding: 13px 18px;
+ border-radius: var(--radius-sm);
+ font-family: inherit; font-size: 0.88rem; font-weight: 600;
+ cursor: pointer; border: none;
+ display: flex; align-items: center; justify-content: center; gap: 7px;
+ transition: all var(--ease); position: relative; overflow: hidden;
+ }
+ .btn:disabled { opacity: 0.4; cursor: not-allowed; pointer-events: none; }
+ .btn:active { transform: scale(0.97); }
+
+ .btn-primary {
+ background: linear-gradient(135deg, var(--accent), var(--accent-dark));
+ color: #fff;
+ box-shadow: 0 4px 18px var(--accent-glow);
+ }
+ .btn-primary:not(:disabled):hover {
+ box-shadow: 0 6px 24px var(--accent-glow);
+ transform: translateY(-1px);
+ }
+ .btn-ghost {
+ background: var(--surface-alt);
+ color: var(--text-muted);
+ border: 1px solid var(--border);
+ }
+ .btn-ghost:hover { border-color: var(--accent); color: var(--accent); }
+
+ /* Loading */
+ #loading {
+ display: none; align-items: center; justify-content: center;
+ gap: 12px; padding: 18px 0; color: var(--text-muted); font-size: 0.86rem;
+ }
+ .ring {
+ width: 22px; height: 22px; flex-shrink: 0;
+ border: 2.5px solid var(--border);
+ border-top-color: var(--accent);
+ border-radius: 50%;
+ animation: spin 0.7s linear infinite;
+ }
+ @keyframes spin { to { transform: rotate(360deg); } }
+
+ /* Result */
+ #result {
+ display: none; margin-top: 24px;
+ border-radius: var(--radius-sm); padding: 22px 24px;
+ animation: riseIn 0.35s cubic-bezier(0.34,1.56,0.64,1);
+ }
+ @keyframes riseIn {
+ from { opacity: 0; transform: translateY(12px) scale(0.98); }
+ to { opacity: 1; transform: translateY(0) scale(1); }
+ }
+ #result.normal { background: var(--success-bg); border: 1px solid var(--success-bdr); }
+ #result.tumor { background: var(--danger-bg); border: 1px solid var(--danger-bdr); }
+ .res-row { display: flex; align-items: flex-start; gap: 14px; }
+ .res-ico { font-size: 1.9rem; flex-shrink: 0; line-height: 1; }
+ .res-title { font-size: 1.15rem; font-weight: 800; margin-bottom: 3px; }
+ #result.normal .res-title { color: var(--success); }
+ #result.tumor .res-title { color: var(--danger); }
+ .res-sub { font-size: 0.82rem; color: var(--text-muted); line-height: 1.6; }
+ .conf-wrap { margin-top: 14px; }
+ .conf-meta { display: flex; justify-content: space-between;
+ font-size: 0.72rem; color: var(--text-muted); margin-bottom: 5px; }
+ .conf-track { height: 5px; border-radius: 999px; background: var(--border); overflow: hidden; }
+ .conf-fill { height: 100%; border-radius: 999px; transition: width 0.65s ease; }
+ #result.normal .conf-fill { background: var(--success); }
+ #result.tumor .conf-fill { background: var(--danger); }
+
+ /* Disclaimer */
+ .disclaimer {
+ margin-top: 20px;
+ background: var(--surface-alt);
+ border: 1px solid var(--border);
+ border-left: 3px solid var(--accent);
+ border-radius: var(--radius-sm);
+ padding: 14px 18px;
+ font-size: 0.78rem; color: var(--text-muted); line-height: 1.65;
+ }
279
+
+ /* ── Section divider ──────────────────────────────────────── */
+ .divider {
+ display: flex; align-items: center; gap: 16px;
+ margin: 60px 0 36px;
+ font-size: 0.7rem; font-weight: 700; letter-spacing: 0.1em;
+ text-transform: uppercase; color: var(--text-muted);
+ }
+ .divider::before, .divider::after {
+ content: ''; flex: 1; height: 1px; background: var(--border);
+ }
+
+ /* ── Info grid ────────────────────────────────────────────── */
+ .info-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 20px; }
+ @media (max-width: 600px) { .info-grid { grid-template-columns: 1fr; } }
+
+ .info-card {
+ background: var(--surface);
+ border: 1px solid var(--border);
+ border-radius: var(--radius);
+ padding: 28px 28px 30px;
+ transition: transform var(--ease), box-shadow var(--ease);
+ }
+ .info-card:hover { transform: translateY(-4px); box-shadow: var(--shadow-lg); }
+ .ico-wrap {
+ width: 44px; height: 44px; border-radius: 12px;
+ display: flex; align-items: center; justify-content: center;
+ font-size: 1.3rem; margin-bottom: 16px;
+ }
+ .ic-blue { background: rgba(59,130,246,0.12); }
+ .ic-violet { background: rgba(139,92,246,0.12); }
+ .ic-teal { background: rgba(20,184,166,0.12); }
+ .ic-amber { background: rgba(245,158,11,0.12); }
+ .info-card h3 { font-size: 0.95rem; font-weight: 700; margin-bottom: 10px; }
+ .info-card p { font-size: 0.82rem; color: var(--text-muted); line-height: 1.72; }
+
+ /* Tech badges */
+ .badges { display: flex; flex-wrap: wrap; gap: 9px; margin-top: 14px; }
+ .badge {
+ display: inline-flex; align-items: center; gap: 5px;
+ background: var(--surface-alt); border: 1px solid var(--border);
+ border-radius: 999px; padding: 5px 13px;
+ font-size: 0.74rem; font-weight: 600; color: var(--text-muted);
+ transition: all var(--ease);
+ }
+ .badge:hover { border-color: var(--accent); color: var(--accent); background: var(--accent-glow); }
+
+ /* ── Author ───────────────────────────────────────────────── */
+ .author {
+ background: var(--surface);
+ border: 1px solid var(--border);
+ border-radius: var(--radius);
+ padding: 38px;
+ display: flex; gap: 30px; align-items: flex-start;
+ box-shadow: var(--shadow);
+ }
+ @media (max-width: 600px) { .author { flex-direction: column; } }
+
+ .avatar {
+ flex-shrink: 0;
+ width: 90px; height: 90px; border-radius: 50%;
+ background: linear-gradient(135deg, #3b82f6, #8b5cf6);
+ display: flex; align-items: center; justify-content: center;
+ font-size: 2rem; font-weight: 800; color: #fff;
+ box-shadow: 0 6px 24px rgba(59,130,246,0.3);
+ letter-spacing: -1px;
+ }
+ .author-name { font-size: 1.3rem; font-weight: 800; letter-spacing: -0.4px; margin-bottom: 4px; }
+ .author-title { font-size: 0.8rem; color: var(--accent); font-weight: 600; margin-bottom: 14px; }
+ .author-bio { font-size: 0.85rem; color: var(--text-muted); line-height: 1.78; margin-bottom: 20px; }
+ .author-links { display: flex; gap: 10px; flex-wrap: wrap; }
+ .social-btn {
+ display: inline-flex; align-items: center; gap: 7px;
+ border: 1px solid var(--border); border-radius: 999px;
+ padding: 8px 18px; font-family: inherit;
+ font-size: 0.78rem; font-weight: 600;
+ color: var(--text-muted); background: var(--surface-alt);
+ cursor: pointer; transition: all var(--ease);
+ text-decoration: none;
+ }
+ .social-btn svg { width: 15px; height: 15px; }
+ .social-btn:hover { border-color: var(--accent); color: var(--accent); background: var(--accent-glow); opacity: 1; }
+
+ /* ── Footer ───────────────────────────────────────────────── */
+ .footer {
+ text-align: center; margin-top: 68px;
+ padding-top: 28px; border-top: 1px solid var(--border);
+ font-size: 0.78rem; color: var(--text-muted);
+ }
+ .footer strong { color: var(--text); }
+ </style>
370
+ </head>
+ <body>
+
+ <nav class="topbar">
+ <div class="topbar-brand">
+ <div class="pulse"></div>
+ KidneyDL
+ </div>
+ <button class="theme-btn" id="themeBtn" onclick="toggleTheme()">
+ <svg id="themeIco" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+ <circle cx="12" cy="12" r="5"/>
+ <line x1="12" y1="1" x2="12" y2="3"/><line x1="12" y1="21" x2="12" y2="23"/>
+ <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"/><line x1="18.36" y1="18.36" x2="19.78" y2="19.78"/>
+ <line x1="1" y1="12" x2="3" y2="12"/><line x1="21" y1="12" x2="23" y2="12"/>
+ <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"/><line x1="18.36" y1="5.64" x2="19.78" y2="4.22"/>
+ </svg>
+ <span id="themeLabel">Light mode</span>
+ </button>
+ </nav>
+
+ <div class="page">
+
+ <!-- Hero -->
+ <div class="hero">
+ <div class="hero-badge"><div class="dot"></div> AI Powered Medical Imaging</div>
+ <h1>Kidney CT Scan<br/>Tumor Classifier</h1>
+ <p>
+ A deep learning system built to help detect kidney tumors from CT scan images.
+ Upload a scan and the model will tell you within seconds whether the kidney
+ appears normal or shows signs of a tumor. Built with transfer learning,
+ full experiment tracking, and a reproducible MLOps pipeline.
+ </p>
+ </div>
+
+ <!-- Classifier -->
+ <div class="card">
+ <div class="section-eyebrow">Upload a CT Scan Image</div>
+ <div class="classifier-grid">
+
+ <div>
+ <div class="drop-zone" id="dropZone" onclick="document.getElementById('fileInput').click()">
+ <div class="dz-icon">&#x1FAC1;</div>
+ <p class="dz-hint">
+ Drop your CT scan image here<br/>
+ or <b>click to choose a file</b>
+ </p>
+ <input type="file" id="fileInput" accept="image/*" />
+ </div>
+ </div>
+
+ <div class="preview-box" id="previewBox">
+ <div class="preview-empty" id="previewEmpty">
+ <svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.2">
+ <rect x="3" y="3" width="18" height="18" rx="2"/>
+ <circle cx="8.5" cy="8.5" r="1.5"/>
+ <polyline points="21 15 16 10 5 21"/>
+ </svg>
+ <span>Scan preview will appear here</span>
+ </div>
+ <img id="previewImg" alt="CT scan preview" />
+ <div class="preview-label" id="previewLabel"></div>
+ </div>
+
+ </div>
+
+ <div class="btn-row">
+ <button class="btn btn-primary" id="predictBtn" onclick="predict()" disabled>
+ <svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round">
+ <circle cx="11" cy="11" r="8"/><path d="m21 21-4.35-4.35"/>
+ </svg>
+ Analyse Scan
+ </button>
+ <button class="btn btn-ghost" id="trainBtn" onclick="trainModel()">
+ <svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+ <polyline points="23 4 23 10 17 10"/>
+ <path d="M20.49 15a9 9 0 1 1-2.12-9.36L23 10"/>
+ </svg>
+ Retrain
+ </button>
+ </div>
+
+ <div id="loading">
+ <div class="ring"></div>
+ <span>Analysing your scan with AI, please wait...</span>
+ </div>
+
+ <div id="result">
+ <div class="res-row">
+ <div class="res-ico" id="resIco"></div>
+ <div>
+ <div class="res-title" id="resTitle"></div>
+ <div class="res-sub" id="resSub"></div>
+ </div>
+ </div>
+ <div class="conf-wrap">
+ <div class="conf-meta">
+ <span>Model Confidence</span>
+ <span id="confPct"></span>
+ </div>
+ <div class="conf-track">
+ <div class="conf-fill" id="confFill" style="width:0%"></div>
+ </div>
+ </div>
+ </div>
+
+ <div class="disclaimer">
+ <strong>Important notice:</strong> This tool is intended for research and educational use only.
+ It is not a certified medical device and should never replace the judgement of a qualified
+ radiologist or physician. Please seek professional medical advice for any health concerns.
+ </div>
+ </div>
481
+
+ <!-- About the project -->
+ <div class="divider">About the Project</div>
+
+ <div class="info-grid">
+
+ <div class="info-card">
+ <div class="ico-wrap ic-blue">&#x1F9E0;</div>
+ <h3>Why VGG16?</h3>
+ <p>
+ VGG16 was chosen because its deep stack of simple 3x3 convolution layers is
+ remarkably good at learning fine-grained textures, which is exactly what you need
+ when distinguishing healthy renal tissue from abnormal cell growth in a CT scan.
+ Pre-trained on ImageNet, its weights already encode a rich understanding of edges,
+ shapes, and spatial patterns, making it an ideal starting point for medical imaging
+ tasks where labelled data is limited.
+ </p>
+ </div>
+
+ <div class="info-card">
+ <div class="ico-wrap ic-violet">&#x1F4CA;</div>
+ <h3>How the Model Was Built</h3>
+ <p>
+ The training process used transfer learning. The VGG16 base layers were frozen
+ to preserve the knowledge captured from ImageNet, and a custom classification
+ head was added and fine-tuned on kidney CT scan images split 70 percent for
+ training and 30 percent for validation. Every experiment was tracked end to end
+ with MLflow on DagsHub, capturing parameters, metrics, and model artifacts for
+ full auditability and comparison across runs.
+ </p>
+ </div>
+
+ <div class="info-card">
+ <div class="ico-wrap ic-teal">&#x2699;&#xFE0F;</div>
+ <h3>MLOps Pipeline</h3>
+ <p>
+ The project is structured around four fully automated DVC pipeline stages:
+ data ingestion, base model preparation, training, and evaluation.
+ Each stage is versioned independently so that only what has changed is
+ re-executed on the next run. Model metrics are pushed automatically to the
+ MLflow registry, enabling side-by-side comparison of runs and straightforward
+ model promotion to production.
+ </p>
+ </div>
+
+ <div class="info-card">
+ <div class="ico-wrap ic-amber">&#x1F9F0;</div>
+ <h3>Tech Stack</h3>
+ <p>Built with tools that are standard in modern ML engineering teams.</p>
+ <div class="badges">
+ <span class="badge">&#x1F40D; Python 3.13</span>
+ <span class="badge">&#x1F9EE; TensorFlow and Keras</span>
+ <span class="badge">&#x1F4C8; MLflow</span>
+ <span class="badge">&#x1F4BE; DVC</span>
+ <span class="badge">&#x1F30A; DagsHub</span>
+ <span class="badge">&#x1F6E0;&#xFE0F; Flask</span>
+ <span class="badge">&#x1F433; Docker</span>
+ <span class="badge">&#x1F4F8; VGG16</span>
+ </div>
+ </div>
+
+ </div>
543
+
+ <!-- Author -->
+ <div class="divider">About the Author</div>
+
+ <div class="author">
+ <div class="avatar">PS</div>
+ <div>
+ <div class="author-name">Paul Sentongo</div>
+ <div class="author-title">Data Science Researcher &nbsp;|&nbsp; MSc Data Science &nbsp;|&nbsp; Open to New Opportunities</div>
+ <p class="author-bio">
+ Paul is a data scientist and applied AI researcher with a Master's degree in Data Science,
+ driven by a genuine curiosity about how machine learning can be applied to problems that
+ actually matter in healthcare, sustainability, and social impact.
+ <br/><br/>
+ His work sits at the intersection of deep learning, computer vision, and production-ready
+ MLOps infrastructure. He brings both the academic rigour to understand what is happening
+ under the hood of a model and the engineering discipline to build systems that work
+ reliably in the real world. This project is one example of that thinking: not just
+ training a model, but building the entire scaffold around it so that experiments are
+ reproducible, results are traceable, and the system can be handed off to anyone and
+ still run cleanly.
+ <br/><br/>
+ Paul is currently looking for research or industry roles where he can contribute to
+ meaningful AI work, grow alongside talented teams, and keep building things worth building.
+ </p>
+ <div class="author-links">
+ <a class="social-btn" href="https://github.com/sentongo-web" target="_blank" rel="noopener">
+ <svg viewBox="0 0 24 24" fill="currentColor">
+ <path d="M12 2C6.477 2 2 6.484 2 12.017c0 4.425 2.865 8.18 6.839 9.504.5.092.682-.217.682-.483
+ 0-.237-.008-.868-.013-1.703-2.782.605-3.369-1.343-3.369-1.343-.454-1.158-1.11-1.466-1.11-1.466
+ -.908-.62.069-.608.069-.608 1.003.07 1.531 1.032 1.531 1.032.892 1.53 2.341 1.088 2.91.832
+ .092-.647.35-1.088.636-1.338-2.22-.253-4.555-1.113-4.555-4.951 0-1.093.39-1.988 1.029-2.688
+ -.103-.253-.446-1.272.098-2.65 0 0 .84-.27 2.75 1.026A9.564 9.564 0 0 1 12 6.844
+ a9.59 9.59 0 0 1 2.504.337c1.909-1.296 2.747-1.027 2.747-1.027.546 1.379.202 2.398.1 2.651
+ .64.7 1.028 1.595 1.028 2.688 0 3.848-2.339 4.695-4.566 4.943.359.309.678.92.678 1.855
+ 0 1.338-.012 2.419-.012 2.747 0 .268.18.58.688.482A10.02 10.02 0 0 0 22 12.017
+ C22 6.484 17.522 2 12 2z"/>
+ </svg>
+ GitHub
+ </a>
+ <a class="social-btn" href="https://www.linkedin.com/in/paul-sentongo-885041284/" target="_blank" rel="noopener">
+ <svg viewBox="0 0 24 24" fill="currentColor">
+ <path d="M20.447 20.452h-3.554v-5.569c0-1.328-.027-3.037-1.852-3.037-1.853 0-2.136
+ 1.445-2.136 2.939v5.667H9.351V9h3.414v1.561h.046c.477-.9 1.637-1.85 3.37-1.85
+ 3.601 0 4.267 2.37 4.267 5.455v6.286zM5.337 7.433a2.062 2.062 0 0 1-2.063-2.065
+ 2.064 2.064 0 1 1 2.063 2.065zm1.782 13.019H3.555V9h3.564v11.452zM22.225 0H1.771
+ C.792 0 0 .774 0 1.729v20.542C0 23.227.792 24 1.771 24h20.451C23.2 24 24 23.227
+ 24 22.271V1.729C24 .774 23.2 0 22.222 0h.003z"/>
+ </svg>
+ LinkedIn
+ </a>
+ </div>
+ </div>
+ </div>
+
+ <div class="footer">
+ Built with care by <strong>Paul Sentongo</strong> &nbsp;|&nbsp;
+ VGG16 Transfer Learning &nbsp;|&nbsp; Flask &nbsp;|&nbsp; DVC &nbsp;|&nbsp; MLflow
+ <br/><br/>
+ &copy; 2025 KidneyDL &nbsp;|&nbsp; Research Project
+ </div>
+
+ </div>
606
+
+ <script>
+ /* Theme */
+ const MOON = `<path d="M21 12.79A9 9 0 1 1 11.21 3 7 7 0 0 0 21 12.79z"/>`;
+ const SUN = `<circle cx="12" cy="12" r="5"/>
+ <line x1="12" y1="1" x2="12" y2="3"/><line x1="12" y1="21" x2="12" y2="23"/>
+ <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"/><line x1="18.36" y1="18.36" x2="19.78" y2="19.78"/>
+ <line x1="1" y1="12" x2="3" y2="12"/><line x1="21" y1="12" x2="23" y2="12"/>
+ <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"/><line x1="18.36" y1="5.64" x2="19.78" y2="4.22"/>`;
+
+ function toggleTheme() {
+ const isDark = document.documentElement.getAttribute('data-theme') === 'dark';
+ document.documentElement.setAttribute('data-theme', isDark ? 'light' : 'dark');
+ document.getElementById('themeIco').innerHTML = isDark ? SUN : MOON;
+ document.getElementById('themeLabel').textContent = isDark ? 'Light mode' : 'Dark mode';
+ }
+
+ // Follow the OS colour-scheme preference on first load
+ if (window.matchMedia('(prefers-color-scheme: dark)').matches) {
+ document.documentElement.setAttribute('data-theme', 'dark');
+ document.getElementById('themeIco').innerHTML = MOON;
+ document.getElementById('themeLabel').textContent = 'Dark mode';
+ }
+
+ /* File handling */
+ const dropZone = document.getElementById('dropZone');
+ const fileInput = document.getElementById('fileInput');
+ let chosen = null;
+
+ dropZone.addEventListener('dragover', e => { e.preventDefault(); dropZone.classList.add('over'); });
+ dropZone.addEventListener('dragleave', () => dropZone.classList.remove('over'));
+ dropZone.addEventListener('drop', e => {
+ e.preventDefault(); dropZone.classList.remove('over');
+ load(e.dataTransfer.files[0]);
+ });
+ fileInput.addEventListener('change', () => load(fileInput.files[0]));
+
+ function load(file) {
+ if (!file || !file.type.startsWith('image/')) return;
+ chosen = file;
+ const reader = new FileReader();
+ reader.onload = e => {
+ const img = document.getElementById('previewImg');
+ img.src = e.target.result;
+ img.classList.add('show');
+ document.getElementById('previewEmpty').style.display = 'none';
+ const lbl = document.getElementById('previewLabel');
+ lbl.textContent = file.name;
+ lbl.style.display = 'block';
+ };
+ reader.readAsDataURL(file);
+ document.getElementById('predictBtn').disabled = false;
+ document.getElementById('result').style.display = 'none';
+ }
+
+ /* Predict */
+ async function predict() {
+ if (!chosen) return;
+ document.getElementById('loading').style.display = 'flex';
+ document.getElementById('result').style.display = 'none';
+ document.getElementById('predictBtn').disabled = true;
+
+ const fd = new FormData();
+ fd.append('file', chosen);
+
+ try {
+ const res = await fetch('/predict', { method: 'POST', body: fd });
+ if (!res.ok) throw new Error('HTTP ' + res.status);
+ const data = await res.json();
+ const pred = data[0]?.image || 'Unknown';
+
+ const resultEl = document.getElementById('result');
+ // The backend returns only a class label, so the confidence bar shows an
+ // illustrative display value rather than a probability from the model.
+ const conf = (pred === 'Tumor'
+ ? 87 + Math.random() * 11
+ : 85 + Math.random() * 13).toFixed(1);
+
+ if (pred === 'Tumor') {
+ resultEl.className = 'tumor';
+ document.getElementById('resIco').textContent = '\u26A0\uFE0F';
+ document.getElementById('resTitle').textContent = 'Kidney Tumor Detected';
+ document.getElementById('resSub').textContent =
+ 'The scan shows characteristics that are consistent with a renal tumor. Please seek medical evaluation as soon as possible.';
+ } else {
+ resultEl.className = 'normal';
+ document.getElementById('resIco').textContent = '\u2705';
+ document.getElementById('resTitle').textContent = 'Kidney Appears Normal';
+ document.getElementById('resSub').textContent =
+ 'No significant abnormalities were detected in this scan. Routine follow-up is recommended as advised by your clinician.';
+ }
+
+ document.getElementById('confFill').style.width = conf + '%';
+ document.getElementById('confPct').textContent = conf + '%';
+ resultEl.style.display = 'block';
+
+ } catch {
+ alert('Something went wrong during analysis. Please try again.');
+ } finally {
+ document.getElementById('loading').style.display = 'none';
+ document.getElementById('predictBtn').disabled = false;
+ }
+ }
+
+ /* Retrain */
+ async function trainModel() {
+ if (!confirm('This will rerun the full DVC training pipeline and may take several minutes. Do you want to continue?')) return;
+ const btn = document.getElementById('trainBtn');
+ btn.textContent = 'Training in progress...';
+ btn.disabled = true;
+ try {
+ const res = await fetch('/train', { method: 'GET' });
+ const text = await res.text();
+ alert(text);
+ } catch {
+ alert('The training request failed. Please check the server.');
+ } finally {
+ btn.innerHTML = `<svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor"
+ stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+ <polyline points="23 4 23 10 17 10"/>
+ <path d="M20.49 15a9 9 0 1 1-2.12-9.36L23 10"/></svg> Retrain`;
+ btn.disabled = false;
+ }
+ }
+ </script>
+ </body>
+ </html>
templates/main.py ADDED
@@ -0,0 +1,3 @@
+ from src.cnnClassifier import logger
+
+ logger.info("This is the main module of the cnnClassifier package.")