Sentoz committed on
Commit 3e93e14 · verified · 1 Parent(s): 37aa6b6

Deploy KidneyDL CT Scan Classifier
.dockerignore ADDED
@@ -0,0 +1,49 @@
+ # Git
+ .git
+ .gitignore
+
+ # DVC internals
+ .dvc/cache
+ .dvc/tmp
+
+ # CT scan training images are large and not needed inside the container.
+ # The trained model at artifacts/training/model.h5 is kept so the container
+ # can serve predictions without a volume mount.
+ artifacts/data_ingestion/
+
+ # Python caches
+ __pycache__
+ *.py[cod]
+ *.pyo
+ *.pyd
+ .Python
+
+ # Virtual environments and Conda
+ .venv
+ venv
+ env
+ *.egg-info
+ dist
+ build
+
+ # Jupyter notebooks (not needed in the container)
+ research/
+ *.ipynb
+ .ipynb_checkpoints
+
+ # Logs
+ logs/
+ *.log
+
+ # Secrets (never bake credentials into the image)
+ .env
+
+ # Test and dev artifacts
+ uploads/
+ scores.json
+
+ # Editor and OS noise
+ .vscode
+ .idea
+ *.DS_Store
+ Thumbs.db
.dvc/.gitignore ADDED
@@ -0,0 +1,3 @@
+ /config.local
+ /tmp
+ /cache
.dvc/config ADDED
File without changes
.dvcignore ADDED
@@ -0,0 +1,3 @@
+ # Add patterns of files dvc should ignore, which could improve
+ # the performance. Learn more at
+ # https://dvc.org/doc/user-guide/dvcignore
Dockerfile ADDED
@@ -0,0 +1,34 @@
+ FROM python:3.10-slim
+
+ # Keeps Python output unbuffered so logs appear immediately in Docker
+ ENV PYTHONUNBUFFERED=1 \
+     PYTHONUTF8=1 \
+     PIP_NO_CACHE_DIR=1 \
+     PIP_DISABLE_PIP_VERSION_CHECK=1
+
+ WORKDIR /app
+
+ # System libraries required by TensorFlow, OpenCV, and image processing
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     libglib2.0-0 \
+     libsm6 \
+     libxrender1 \
+     libxext6 \
+     libgl1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Install Python dependencies first so this layer is cached between code changes
+ COPY requirements.txt .
+ RUN pip install --upgrade pip && \
+     pip install -r requirements.txt
+
+ # Copy the full project and install the cnnClassifier package
+ COPY . .
+ RUN pip install -e .
+
+ # Directory for uploaded scan images at runtime
+ RUN mkdir -p uploads
+
+ EXPOSE 7860
+
+ CMD ["python", "app.py"]
README.md CHANGED
@@ -1,11 +1,347 @@
- ---
- title: Kidney Classifier
- emoji: 🌍
- colorFrom: blue
- colorTo: gray
- sdk: docker
- pinned: false
- license: mit
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ---
+ title: KidneyDL CT Scan Classifier
+ emoji: 🫁
+ colorFrom: blue
+ colorTo: indigo
+ sdk: docker
+ app_port: 7860
+ pinned: true
+ license: mit
+ ---
+
+ ## KidneyAI: End-to-End Kidney CT Scan Classification with MLOps
+
+ [![Python](https://img.shields.io/badge/Python-3.10-blue?logo=python&logoColor=white)](https://www.python.org/)
+ [![TensorFlow](https://img.shields.io/badge/TensorFlow-2.x-orange?logo=tensorflow&logoColor=white)](https://www.tensorflow.org/)
+ [![DVC](https://img.shields.io/badge/DVC-Pipeline%20Versioning-945DD6?logo=dvc&logoColor=white)](https://dvc.org/)
+ [![MLflow](https://img.shields.io/badge/MLflow-Experiment%20Tracking-0194E2?logo=mlflow&logoColor=white)](https://mlflow.org/)
+ [![DagsHub](https://img.shields.io/badge/DagsHub-Remote%20Tracking-FF6B35?logoColor=white)](https://dagshub.com/)
+ [![Flask](https://img.shields.io/badge/Flask-Web%20App-000000?logo=flask&logoColor=white)](https://flask.palletsprojects.com/)
+ [![Docker](https://img.shields.io/badge/Docker-Containerised-2496ED?logo=docker&logoColor=white)](https://www.docker.com/)
+
+ ---
+
+ ## What This Project Is
+
+ This is a production-style, end-to-end machine learning project that classifies kidney CT scan images as either **Normal** or **Tumor**. But the model itself is only one piece of the story. The real focus of this project is everything that surrounds it: a fully reproducible DVC pipeline, experiment tracking with MLflow and DagsHub, a clean configuration-driven codebase, a Flask web application, and a Dockerised deployment setup.
+
+ It was built to demonstrate what a real MLOps workflow looks like in practice: not just the notebook that produces a metric, but the entire system that allows a model to be trained, evaluated, versioned, and served reliably.
+
+ ---
+
+ ## The Problem
+
+ Kidney disease is among the leading causes of death globally, and it often goes undetected until its later stages, when treatment options become limited. Radiologists manually reviewing CT scans are under enormous pressure, and any tool that can reliably flag suspicious scans for closer attention has genuine clinical value.
+
+ This project builds a binary image classifier that can look at a kidney CT scan and tell you, within seconds, whether the kidney appears normal or shows signs of a tumor. It is trained on a labelled CT scan dataset and achieves approximately **89.9% validation accuracy** using a fine-tuned VGG16 network.
+
+ ---
+
+ ## Why VGG16?
+
+ VGG16 was selected deliberately, not arbitrarily. Here is the reasoning:
+
+ Its architecture is built from uniform 3x3 convolutional layers stacked with increasing depth. This design is especially good at learning fine-grained local textures, which is critical in medical imaging, where the difference between healthy and abnormal tissue often comes down to subtle structural patterns rather than large-scale shape differences.
+
+ Pre-trained on ImageNet, VGG16 already knows how to see. Its lower layers encode general-purpose feature detectors for edges, corners, and textures. Those weights do not need to be learned from scratch. Only the top classification layers need to be adapted to the kidney scan domain, which means the model can achieve strong performance with far less labelled data than training from scratch would require.
+
+ It is also a stable, well-understood architecture. In a medical context, that matters. The behaviour of the model is predictable, and the features it learns can be interpreted through tools like Grad-CAM.
+
+ ---
+
+ ## Model Performance
+
+ | Metric   | Value  |
+ |----------|--------|
+ | Accuracy | 89.9%  |
+ | Loss     | 1.26   |
+
+ Metrics are logged automatically to MLflow after every pipeline run. You can view all experiment runs, compare parameters, and download model artifacts directly from the DagsHub MLflow UI.
+
+ ---
+
+ ## Project Structure
+
+ ```text
+ Kidney_classification_Using_MLOPS_and_DVC/
+
+ ├── config/
+ │   └── config.yaml                     Central path and artifact configuration
+
+ ├── params.yaml                         All model hyperparameters in one place
+ ├── dvc.yaml                            DVC pipeline stage definitions
+ ├── dvc.lock                            DVC lock file tracking stage state
+ ├── main.py                             Runs all pipeline stages sequentially
+ ├── app.py                              Flask web application
+ ├── Dockerfile                          Container definition for the prediction server
+ ├── requirements.txt                    Python dependencies
+ ├── setup.py                            Installable package definition
+ ├── scores.json                         Latest evaluation metrics
+
+ ├── src/cnnClassifier/
+ │   ├── __init__.py                     Logger setup
+ │   ├── constants/                      Project-wide constants (config file paths)
+ │   ├── entity/
+ │   │   └── config_entity.py            Typed dataclasses for each pipeline stage config
+ │   ├── config/
+ │   │   └── configuration.py            ConfigurationManager: reads YAML and builds configs
+ │   ├── utils/
+ │   │   └── common.py                   Shared utilities: YAML reading, directory creation, JSON saving
+ │   ├── components/
+ │   │   ├── data_ingestion.py           Downloads and extracts the dataset
+ │   │   ├── prepare_base_model.py       Loads VGG16 and adds the classification head
+ │   │   ├── model_trainer.py            Trains the model with augmentation support
+ │   │   └── model_evaluation_mlflow.py  Evaluates and logs to MLflow via DagsHub
+ │   └── pipeline/
+ │       ├── stage_01_data_ingestion.py
+ │       ├── stage_02_prepare_base_model.py
+ │       ├── stage_03_model_trainer.py
+ │       ├── stage_04_model_evaluation.py
+ │       └── prediction.py               Prediction pipeline used by the Flask app
+
+ ├── research/
+ │   ├── 01_data_ingestion.ipynb
+ │   ├── 02_prepare_base_model.ipynb
+ │   ├── 03_model_trainer.ipynb
+ │   └── 04_model_evaluation.ipynb       Each stage was prototyped here first
+
+ └── templates/
+     └── index.html                      Web UI for the prediction app
+ ```
+
+ ---
+
+ ## The ML Pipeline
+
+ The pipeline has four stages, each defined in `dvc.yaml` and executed in order by DVC.
+
+ ```text
+ Stage 1           Stage 2                  Stage 3          Stage 4
+ Data Ingestion    Base Model Preparation   Model Training   Model Evaluation
+ ```
+
+ ### Stage 1: Data Ingestion
+
+ Downloads the kidney CT scan dataset from Google Drive using `gdown`, extracts the zip archive, and places the images into the `artifacts/data_ingestion/` directory. DVC tracks the output so this stage is skipped if the data already exists and nothing has changed.
+
+ ### Stage 2: Base Model Preparation
+
+ Loads VGG16 with ImageNet weights and without its top classification layers. Adds a custom head: a global average pooling layer followed by a dense output layer with softmax activation for the two classes, Normal and Tumor. The base VGG16 layers are frozen. The resulting model is saved to disk so the training stage can pick it up.
+
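The custom head ends in a two-unit softmax layer. As a reminder of what that final activation computes, here is a minimal stdlib sketch; the logit values are invented for illustration and are not outputs of the actual model:

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the max logit first for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the two classes [Normal, Tumor]
probs = softmax([2.0, 0.5])
print(probs)  # the first (Normal) probability is the larger one
```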
+ ### Stage 3: Model Training
+
+ Loads the prepared base model, recompiles it with an SGD optimiser, and trains it on the kidney CT images. Supports data augmentation (horizontal flip, zoom, shear) to improve generalisation. The trained model is saved as `artifacts/training/model.h5`.
+
+ ### Stage 4: Model Evaluation
+
+ Loads the trained model and evaluates it against the 30 percent validation split. Loss and accuracy are saved to `scores.json` and logged to MLflow. The model is also registered in the MLflow Model Registry under the name `VGG16Model`.
+
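The save step for `scores.json` is simple enough to sketch with the stdlib alone (the project's actual helper lives in `utils/common.py`; this stand-in writes to a temp directory, and the metric values shown are the ones reported above):

```python
import json
import tempfile
from pathlib import Path

def save_scores(path: Path, loss: float, accuracy: float) -> None:
    """Persist evaluation metrics in the shape scores.json uses."""
    path.write_text(json.dumps({"loss": loss, "accuracy": accuracy}, indent=4))

# Written to a temp dir here; the pipeline writes scores.json at the repo root
scores_path = Path(tempfile.mkdtemp()) / "scores.json"
save_scores(scores_path, loss=1.26, accuracy=0.899)

scores = json.loads(scores_path.read_text())
print(scores["accuracy"])  # 0.899
```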
+ ---
+
+ ## Experiment Tracking with MLflow and DagsHub
+
+ All runs are tracked remotely on DagsHub, which acts as the MLflow tracking server. Every time the evaluation stage runs, it logs:
+
+ - All hyperparameters from `params.yaml`
+ - Validation loss and accuracy
+ - The trained model as an MLflow artifact
+ - A registered model version in the MLflow Model Registry
+
+ You can view the experiment runs at:
+ [https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow](https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow)
+
+ ---
+
+ ## Configuration
+
+ Everything is driven by two YAML files. There are no hardcoded paths or hyperparameters anywhere in the source code.
+
+ **`config/config.yaml`** manages all file paths and artifact locations:
+
+ ```yaml
+ artifacts_root: artifacts
+
+ data_ingestion:
+   root_dir: artifacts/data_ingestion
+   source_URL: "https://drive.google.com/file/d/16PZpADG4Pl_SBr2E3DEcvXsLQ5DSUtDP/view?usp=sharing"
+   local_data_file: artifacts/data_ingestion/data.zip
+   unzip_dir: artifacts/data_ingestion
+
+ prepare_base_model:
+   root_dir: artifacts/prepare_base_model
+   base_model_path: artifacts/prepare_base_model/base_model.h5
+   updated_base_model_path: artifacts/prepare_base_model/base_model_updated.h5
+
+ training:
+   root_dir: artifacts/training
+   trained_model_path: artifacts/training/model.h5
+
+ evaluation:
+   path_of_model: artifacts/training/model.h5
+   training_data: artifacts/data_ingestion/kidney-ct-scan-image
+   mlflow_uri: "https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow"
+   all_params:
+     AUGMENTATION: True
+     IMAGE_SIZE: [224, 224, 3]
+     BATCH_SIZE: 16
+     INCLUDE_TOP: False
+     EPOCHS: 5
+     CLASSES: 2
+     WEIGHTS: imagenet
+     LEARNING_RATE: 0.01
+ ```
+
+ **`params.yaml`** is where all model hyperparameters live:
+
+ ```yaml
+ AUGMENTATION: True
+ IMAGE_SIZE: [224, 224, 3]
+ BATCH_SIZE: 16
+ INCLUDE_TOP: False
+ EPOCHS: 5
+ CLASSES: 2
+ WEIGHTS: imagenet
+ LEARNING_RATE: 0.01
+ ```
+
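The project reads these YAML files through python-box's `ConfigBox`, which turns the parsed dicts into attribute-accessible objects. A rough stdlib stand-in shows the idea (this is an illustrative sketch, not the library's implementation):

```python
from types import SimpleNamespace

def to_namespace(d):
    """Recursively convert a nested dict (parsed YAML) into dot-accessible objects."""
    if isinstance(d, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in d.items()})
    return d

# A fragment of config.yaml, already parsed into a plain dict
config = to_namespace({
    "training": {
        "root_dir": "artifacts/training",
        "trained_model_path": "artifacts/training/model.h5",
    }
})
print(config.training.trained_model_path)  # artifacts/training/model.h5
```

Dot access like `config.training.trained_model_path` is what keeps the ConfigurationManager code readable compared to chained `dict` lookups.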
+ ---
+
+ ## How to Run Locally
+
+ ### 1. Clone the repository
+
+ ```bash
+ git clone https://github.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.git
+ cd Kidney_classification_Using_MLOPS_and_DVC_Data-version-control
+ ```
+
+ ### 2. Create and activate a Conda environment
+
+ ```bash
+ conda create -n kidney python=3.10 -y
+ conda activate kidney
+ ```
+
+ ### 3. Install dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ pip install -e .
+ ```
+
+ ### 4. Set up your MLflow credentials
+
+ Create a `.env` file in the project root with your DagsHub token:
+
+ ```env
+ MLFLOW_TRACKING_USERNAME=your_dagshub_username
+ MLFLOW_TRACKING_PASSWORD=your_dagshub_token
+ ```
+
+ This file is gitignored and will never be committed.
+
+ ### 5. Run the full pipeline
+
+ ```bash
+ dvc repro
+ ```
+
+ DVC will execute all four stages in order. If any stage has already run and its inputs have not changed, it will be skipped automatically. After the pipeline finishes, `scores.json` will contain the latest evaluation metrics.
+
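DVC decides whether a stage must rerun by comparing content hashes of its dependencies against what `dvc.lock` recorded. A toy illustration of that idea, using the same MD5 hashing that the lock file in this repo stores (the file name here is hypothetical and written to a temp directory):

```python
import hashlib
import tempfile
from pathlib import Path

def file_md5(path: Path) -> str:
    """Hash a file's bytes, mirroring the md5 entries stored in dvc.lock."""
    return hashlib.md5(path.read_bytes()).hexdigest()

# Hypothetical dependency file standing in for params.yaml
dep = Path(tempfile.mkdtemp()) / "params_demo.yaml"
dep.write_text("EPOCHS: 5\n")
recorded = file_md5(dep)          # what dvc.lock would remember

dep.write_text("EPOCHS: 10\n")    # edit the dependency
must_rerun = file_md5(dep) != recorded
print("stage must rerun:", must_rerun)
```

When every dependency's hash matches the lock file, the stage is skipped; any mismatch invalidates the stage and everything downstream of it.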
+ ### 6. Launch the web application
+
+ ```bash
+ python app.py
+ ```
+
+ Open your browser and go to `http://localhost:7860` (the default port set in `app.py`). You can upload a kidney CT scan image and get a classification result instantly.
+
+ ### 7. View experiment runs
+
+ ```bash
+ mlflow ui
+ ```
+
+ Open `http://localhost:5000` to browse all local experiment runs, or visit the DagsHub MLflow URL above to see all remotely tracked runs.
+
+ ---
+
+ ## Run with Docker
+
+ ```bash
+ docker build -t kidney-classifier .
+ docker run -p 7860:7860 kidney-classifier
+ ```
+
+ Open `http://localhost:7860` in your browser. The container listens on port 7860, the port exposed in the Dockerfile.
+
+ ---
+
+ ## The Web Application
+
+ The Flask app exposes three routes:
+
+ | Route      | Method | Description                                                         |
+ | ---------- | ------ | ------------------------------------------------------------------- |
+ | `/`        | GET    | Serves the prediction web UI                                        |
+ | `/predict` | POST   | Accepts an image file and returns the classification result as JSON |
+ | `/train`   | GET    | Reruns `main.py` to retrain the model from scratch                  |
+
+ The prediction endpoint returns a response like this:
+
+ ```json
+ [{"image": "Normal"}]
+ ```
+
+ or
+
+ ```json
+ [{"image": "Tumor"}]
+ ```
+
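A client receiving that response needs nothing beyond the stdlib to decode it; a small sketch of the parsing step (the response bodies are the two shapes shown above):

```python
import json

def parse_prediction(body: str) -> str:
    """Extract the class label from the /predict JSON response."""
    return json.loads(body)[0]["image"]

label = parse_prediction('[{"image": "Tumor"}]')
print(label)  # Tumor
```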
+ The UI supports drag and drop, shows a live preview of the uploaded scan, displays the result with a confidence bar, and works in both light and dark mode with automatic detection of your system preference.
+
+ ---
+
+ ## Tech Stack
+
+ | Area                | Tools                                             |
+ | ------------------- | ------------------------------------------------- |
+ | Deep Learning       | TensorFlow and Keras with VGG16 transfer learning |
+ | Data Versioning     | DVC                                               |
+ | Experiment Tracking | MLflow hosted on DagsHub                          |
+ | Web Framework       | Flask with Flask-CORS                             |
+ | Data Processing     | NumPy, Pandas, scikit-learn                       |
+ | Configuration       | PyYAML and python-box                             |
+ | Package Management  | setuptools with src layout, editable install      |
+ | Containerisation    | Docker                                            |
+ | Environment         | Conda with pip                                    |
+
+ ---
+
+ ## MLOps Concepts Demonstrated
+
+ | Concept                  | How it is implemented                                                           |
+ | ------------------------ | ------------------------------------------------------------------------------- |
+ | Data versioning          | DVC tracks the dataset and all model artifacts                                  |
+ | Pipeline as code         | `dvc.yaml` defines every stage and its dependencies                             |
+ | Incremental execution    | DVC only reruns stages whose inputs have changed                                |
+ | Experiment tracking      | MLflow logs parameters, metrics, and model artifacts on every run               |
+ | Model registry           | Trained models are registered and versioned in the MLflow Model Registry        |
+ | Configuration management | All paths and hyperparameters live in YAML files with no hardcoded values       |
+ | Modular ML package       | Source code is structured as an installable Python package                      |
+ | Reproducibility          | Any contributor can clone the repo and run `dvc repro` to get identical results |
+ | Containerisation         | Dockerfile ensures the app runs consistently in any environment                 |
+ | REST API serving         | Flask wraps the prediction pipeline and exposes it over HTTP                    |
+
+ ---
+
+ ## About the Author
+
+ **Paul Sentongo** is a data scientist and applied AI researcher with a Master's degree in Data Science. He is passionate about building machine learning systems that go beyond the notebook: reproducible, traceable, and deployable. His research interests include deep learning for medical imaging, MLOps infrastructure, and the practical challenges of making AI work in the real world.
+
+ Paul is currently open to research positions and industry roles where he can contribute to meaningful AI projects and grow alongside motivated teams.
+
+ - GitHub: [github.com/sentongo-web](https://github.com/sentongo-web)
+ - LinkedIn: [linkedin.com/in/paul-sentongo-885041284](https://www.linkedin.com/in/paul-sentongo-885041284/)
+ - Email: sentongogray1992@gmail.com
app.py ADDED
@@ -0,0 +1,44 @@
+ import os
+ from flask import Flask, request, jsonify, render_template
+ from flask_cors import CORS
+ from cnnClassifier.pipeline.prediction import PredictionPipeline
+
+ app = Flask(__name__)
+ CORS(app)
+
+ UPLOAD_FOLDER = "uploads"
+ os.makedirs(UPLOAD_FOLDER, exist_ok=True)
+
+
+ @app.route("/", methods=["GET"])
+ def home():
+     return render_template("index.html")
+
+
+ @app.route("/train", methods=["GET", "POST"])
+ def train():
+     os.system("python main.py")
+     return "Training completed successfully!"
+
+
+ @app.route("/predict", methods=["POST"])
+ def predict():
+     if "file" not in request.files:
+         return jsonify({"error": "No file uploaded"}), 400
+
+     file = request.files["file"]
+     if file.filename == "":
+         return jsonify({"error": "No file selected"}), 400
+
+     filepath = os.path.join(UPLOAD_FOLDER, file.filename)
+     file.save(filepath)
+
+     pipeline = PredictionPipeline(filepath)
+     result = pipeline.predict()
+
+     return jsonify(result)
+
+
+ if __name__ == "__main__":
+     port = int(os.environ.get("PORT", 7860))
+     app.run(host="0.0.0.0", port=port, debug=False)
artifacts/prepare_base_model/base_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f6fc070728f3f1ce3d0f140b0ffeee893dc470fec846f5aaa0246f315c2fcb6b
+ size 58926080
artifacts/prepare_base_model/base_model_updated.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f9b58a14a5bb7222c23e8400d8b8e820e4557588d47cbe666b28f9c89313f6bc
+ size 59147544
artifacts/training/model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e1b5f5c330dcc32a5c98f34893464c63d8dc755cec7fe0bef0a449168cf7b2f
+ size 59147544
config/config.yaml ADDED
@@ -0,0 +1,29 @@
+ artifacts_root: artifacts
+ data_ingestion:
+   root_dir: artifacts/data_ingestion
+   source_URL: "https://drive.google.com/file/d/16PZpADG4Pl_SBr2E3DEcvXsLQ5DSUtDP/view?usp=sharing"
+   local_data_file: artifacts/data_ingestion/data.zip
+   unzip_dir: artifacts/data_ingestion
+
+ prepare_base_model:
+   root_dir: artifacts/prepare_base_model
+   base_model_path: artifacts/prepare_base_model/base_model.h5
+   updated_base_model_path: artifacts/prepare_base_model/base_model_updated.h5
+
+ training:
+   root_dir: artifacts/training
+   trained_model_path: artifacts/training/model.h5
+
+ evaluation:
+   path_of_model: artifacts/training/model.h5
+   training_data: artifacts/data_ingestion/kidney-ct-scan-image
+   mlflow_uri: "https://dagshub.com/sentongo-web/Kidney_classification_Using_MLOPS_and_DVC_Data-version-control.mlflow"
+   all_params:
+     AUGMENTATION: True
+     IMAGE_SIZE: [224, 224, 3]
+     BATCH_SIZE: 16
+     INCLUDE_TOP: False
+     EPOCHS: 5
+     CLASSES: 2
+     WEIGHTS: imagenet
+     LEARNING_RATE: 0.01
dvc.lock ADDED
@@ -0,0 +1,148 @@
+ schema: '2.0'
+ stages:
+   data_ingestion:
+     cmd: python src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+     deps:
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: src/cnnClassifier/components/data_ingestion.py
+       hash: md5
+       md5: f07cc7fed589b1f7d14a637aa94a0433
+       size: 1039
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+       hash: md5
+       md5: e501eeb64cda076b3e15b447e55d6463
+       size: 908
+     outs:
+     - path: artifacts/data_ingestion
+       hash: md5
+       md5: 86510b1e2ff6da777357ccfdc278e4c8.dir
+       size: 116493584
+       nfiles: 466
+   prepare_base_model:
+     cmd: python src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+     deps:
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: params.yaml
+       hash: md5
+       md5: 156a3b540bf80876a34e08b09faaf4fb
+       size: 151
+     - path: src/cnnClassifier/components/prepare_base_model.py
+       hash: md5
+       md5: 3c5230f332299193cb460420f3ce5057
+       size: 2063
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+       hash: md5
+       md5: c886276ae57285dac8969b02bf9077ed
+       size: 954
+     params:
+       params.yaml:
+         CLASSES: 2
+         IMAGE_SIZE:
+         - 224
+         - 224
+         - 3
+         INCLUDE_TOP: false
+         LEARNING_RATE: 0.01
+         WEIGHTS: imagenet
+     outs:
+     - path: artifacts/prepare_base_model
+       hash: md5
+       md5: d761158dc61a51df0233a4d98a02499f.dir
+       size: 118073624
+       nfiles: 2
+   training:
+     cmd: python src/cnnClassifier/pipeline/stage_03_model_trainer.py
+     deps:
+     - path: artifacts/data_ingestion/kidney-ct-scan-image
+       hash: md5
+       md5: 33ed59dbe5dec8ce2bb8e489b55203e4.dir
+       size: 58936381
+       nfiles: 465
+     - path: artifacts/prepare_base_model/base_model_updated.h5
+       hash: md5
+       md5: 12a1e3ebb90d89346ff2beb4fa21053b
+       size: 59147544
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: params.yaml
+       hash: md5
+       md5: 156a3b540bf80876a34e08b09faaf4fb
+       size: 151
+     - path: src/cnnClassifier/components/model_trainer.py
+       hash: md5
+       md5: bc19f92e2812f36ba12a7c66730a6e21
+       size: 2675
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_03_model_trainer.py
+       hash: md5
+       md5: 0951b497a475aac360e1c73b347b2295
+       size: 885
+     params:
+       params.yaml:
+         AUGMENTATION: true
+         BATCH_SIZE: 16
+         EPOCHS: 5
+         IMAGE_SIZE:
+         - 224
+         - 224
+         - 3
+         LEARNING_RATE: 0.01
+     outs:
+     - path: artifacts/training/model.h5
+       hash: md5
+       md5: 87e2a46b9573a6bba1da41192f0dff18
+       size: 59147544
+   evaluation:
+     cmd: python src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+     deps:
+     - path: artifacts/training/model.h5
+       hash: md5
+       md5: 87e2a46b9573a6bba1da41192f0dff18
+       size: 59147544
+     - path: config/config.yaml
+       hash: md5
+       md5: 20cd3ab789ce919b3687442bc4f2ab85
+       size: 1016
+     - path: src/cnnClassifier/components/model_evaluation_mlflow.py
+       hash: md5
+       md5: 4612a6a44af8961549348656ece0c848
+       size: 2306
+     - path: src/cnnClassifier/config/configuration.py
+       hash: md5
+       md5: d41b12b3ad8d16ad963a9a34118da1ec
+       size: 2873
+     - path: src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+       hash: md5
+       md5: dff6dc7d6115804b764c98c53f3bbc43
+       size: 908
+     params:
+       params.yaml:
+         BATCH_SIZE: 16
+         IMAGE_SIZE:
+         - 224
+         - 224
+         - 3
+     outs:
+     - path: scores.json
+       hash: md5
+       md5: 1fad9e68e2b1611a3fc59b064c62106e
+       size: 72
dvc.yaml ADDED
@@ -0,0 +1,61 @@
+ stages:
+   data_ingestion:
+     cmd: python src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_01_data_ingestion.py
+       - src/cnnClassifier/components/data_ingestion.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+     outs:
+       - artifacts/data_ingestion
+
+   prepare_base_model:
+     cmd: python src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_02_prepare_base_model.py
+       - src/cnnClassifier/components/prepare_base_model.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+       - params.yaml
+     params:
+       - IMAGE_SIZE
+       - INCLUDE_TOP
+       - CLASSES
+       - WEIGHTS
+       - LEARNING_RATE
+     outs:
+       - artifacts/prepare_base_model
+
+   training:
+     cmd: python src/cnnClassifier/pipeline/stage_03_model_trainer.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_03_model_trainer.py
+       - src/cnnClassifier/components/model_trainer.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+       - params.yaml
+       - artifacts/data_ingestion/kidney-ct-scan-image
+       - artifacts/prepare_base_model/base_model_updated.h5
+     params:
+       - IMAGE_SIZE
+       - EPOCHS
+       - BATCH_SIZE
+       - AUGMENTATION
+       - LEARNING_RATE
+     outs:
+       - artifacts/training/model.h5
+
+   evaluation:
+     cmd: python src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+     deps:
+       - src/cnnClassifier/pipeline/stage_04_model_evaluation.py
+       - src/cnnClassifier/components/model_evaluation_mlflow.py
+       - src/cnnClassifier/config/configuration.py
+       - config/config.yaml
+       - artifacts/training/model.h5
+     params:
+       - IMAGE_SIZE
+       - BATCH_SIZE
+     metrics:
+       - scores.json:
+           cache: false
main.py ADDED
@@ -0,0 +1,41 @@
+ from cnnClassifier import logger
+ from cnnClassifier.pipeline.stage_01_data_ingestion import DataIngestionTrainingPipeline
+ from cnnClassifier.pipeline.stage_02_prepare_base_model import PrepareBaseModelTrainingPipeline
+ from cnnClassifier.pipeline.stage_03_model_trainer import ModelTrainerTrainingPipeline
+ from cnnClassifier.pipeline.stage_04_model_evaluation import ModelEvaluationPipeline
+
+ STAGE_NAME = "Data Ingestion stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     DataIngestionTrainingPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
+
+ STAGE_NAME = "Prepare Base Model stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     PrepareBaseModelTrainingPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
+
+ STAGE_NAME = "Training stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     ModelTrainerTrainingPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
+
+ STAGE_NAME = "Model Evaluation stage"
+ try:
+     logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+     ModelEvaluationPipeline().main()
+     logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+ except Exception as e:
+     logger.exception(e)
+     raise e
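The four near-identical try/except blocks in main.py could also be expressed as a single loop over stages. A hedged sketch of that alternative, with dummy classes standing in for the real pipeline classes (the stage names match main.py, but `DummyStage` is invented for the demo):

```python
import logging

logger = logging.getLogger("cnnClassifierLogger")

class DummyStage:
    """Stand-in for a pipeline stage class exposing a .main() method."""
    def main(self):
        pass

STAGES = [
    ("Data Ingestion stage", DummyStage),
    ("Prepare Base Model stage", DummyStage),
    ("Training stage", DummyStage),
    ("Model Evaluation stage", DummyStage),
]

completed = []
for name, stage_cls in STAGES:
    try:
        logger.info(f">>>>>> stage {name} started <<<<<<")
        stage_cls().main()          # run the stage
        logger.info(f">>>>>> stage {name} completed <<<<<<\n\nx==========x")
        completed.append(name)
    except Exception:
        logger.exception(f"stage {name} failed")
        raise
```

The repeated-block form in main.py is arguably easier to step through; the loop form avoids duplicating the logging and error handling.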
params.yaml ADDED
@@ -0,0 +1,11 @@
+ AUGMENTATION: True
+ IMAGE_SIZE:
+   - 224
+   - 224
+   - 3
+ BATCH_SIZE: 16
+ INCLUDE_TOP: False
+ EPOCHS: 5
+ CLASSES: 2
+ WEIGHTS: imagenet
+ LEARNING_RATE: 0.01
requirements.txt ADDED
@@ -0,0 +1,22 @@
+ tensorflow
+ keras
+ dvc
+ numpy
+ pandas
+ scikit-learn
+ matplotlib
+ seaborn
+ jupyterlab
+ scipy
+ mlflow
+ notebook
+ python-box
+ pyYAML
+ tqdm
+ joblib
+ types-PyYAML
+ Flask
+ Flask-Cors
+ gdown
+ ensure
+ python-dotenv
setup.py ADDED
@@ -0,0 +1,25 @@
+ import setuptools
+
+ with open("README.md", "r", encoding="utf-8") as f:
+     long_description = f.read()
+
+ __version__ = "0.0.0"
+
+ REPO_NAME = "Kidney_classification_Using_MLOPS_and_DVC_Data-version-control"
+
+ AUTHOR_USER_NAME = "sentongo-web"
+ SRC_REPO = "cnnClassifier"
+ AUTHOR_EMAIL = "sentongogray1992@gmail.com"
+
+ setuptools.setup(
+     name=SRC_REPO,
+     version=__version__,
+     author=AUTHOR_USER_NAME,
+     author_email=AUTHOR_EMAIL,
+     description="A machine learning project for kidney classification using MLOps and DVC.",
+     long_description=long_description,
+     long_description_content_type="text/markdown",
+     url=f"https://github.com/{AUTHOR_USER_NAME}/{REPO_NAME}",
+     package_dir={"": "src"},
+     packages=setuptools.find_packages(where="src")
+ )
src/cnnClassifier/__init__.py ADDED
@@ -0,0 +1,20 @@
+ import os
+ import sys
+ import logging
+
+ logging_str = "[%(asctime)s: %(levelname)s: %(module)s]: %(message)s"
+
+ log_dir = "logs"
+ log_filepath = os.path.join(log_dir, "running_logs.log")
+ os.makedirs(log_dir, exist_ok=True)
+
+ logging.basicConfig(
+     level=logging.INFO,
+     format=logging_str,
+     handlers=[
+         logging.FileHandler(log_filepath),
+         logging.StreamHandler(sys.stdout)
+     ]
+ )
+
+ logger = logging.getLogger("cnnClassifierLogger")
src/cnnClassifier/components/__init__.py ADDED
File without changes
src/cnnClassifier/components/data_ingestion.py ADDED
@@ -0,0 +1,26 @@
+ import os
+ import zipfile
+ import gdown
+ from pathlib import Path
+ from cnnClassifier import logger
+ from cnnClassifier.utils.common import get_size
+ from cnnClassifier.entity.config_entity import DataIngestionConfig
+ 
+ 
+ class DataIngestion:
+     def __init__(self, config: DataIngestionConfig):
+         self.config = config
+ 
+     def download_file(self):
+         if not os.path.exists(self.config.local_data_file):
+             gdown.download(self.config.source_URL, str(self.config.local_data_file), quiet=False, fuzzy=True)
+             logger.info(f"Downloaded data to {self.config.local_data_file}")
+         else:
+             logger.info(f"File already exists of size: {get_size(Path(self.config.local_data_file))}")
+ 
+     def extract_zip_file(self):
+         """Extracts the zip file into the unzip directory."""
+         unzip_path = self.config.unzip_dir
+         os.makedirs(unzip_path, exist_ok=True)
+         with zipfile.ZipFile(self.config.local_data_file, 'r') as zip_ref:
+             zip_ref.extractall(unzip_path)
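A stdlib-only sketch of the `extract_zip_file` step: write a tiny archive, then extract it the same way the component does. The archive contents and paths here are hypothetical placeholders, not the real dataset.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for config.local_data_file downloaded by gdown.
    local_data_file = os.path.join(tmp, "data.zip")
    with zipfile.ZipFile(local_data_file, "w") as zf:
        zf.writestr("kidney-ct-scan-image/scan_001.txt", "fake scan")

    # Mirrors extract_zip_file(): ensure the unzip dir exists, then extract.
    unzip_dir = os.path.join(tmp, "extracted")
    os.makedirs(unzip_dir, exist_ok=True)
    with zipfile.ZipFile(local_data_file, "r") as zip_ref:
        zip_ref.extractall(unzip_dir)

    extracted = os.path.join(unzip_dir, "kidney-ct-scan-image", "scan_001.txt")
    extracted_ok = os.path.exists(extracted)

assert extracted_ok
```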
src/cnnClassifier/components/model_evaluation_mlflow.py ADDED
@@ -0,0 +1,63 @@
+ import os
+ import tensorflow as tf
+ from pathlib import Path
+ import dagshub
+ import mlflow
+ import mlflow.tensorflow
+ from urllib.parse import urlparse
+ from cnnClassifier.entity.config_entity import EvaluationConfig
+ from cnnClassifier.utils.common import save_json
+ 
+ 
+ class Evaluation:
+     def __init__(self, config: EvaluationConfig):
+         self.config = config
+ 
+     def _valid_generator(self):
+         datagenerator_kwargs = dict(rescale=1.0 / 255, validation_split=0.30)
+         dataflow_kwargs = dict(
+             target_size=self.config.params_image_size[:-1],
+             batch_size=self.config.params_batch_size,
+             interpolation="bilinear"
+         )
+         valid_datagenerator = tf.keras.preprocessing.image.ImageDataGenerator(
+             **datagenerator_kwargs
+         )
+         self.valid_generator = valid_datagenerator.flow_from_directory(
+             directory=self.config.training_data,
+             subset="validation",
+             shuffle=False,
+             **dataflow_kwargs
+         )
+ 
+     @staticmethod
+     def load_model(path: Path) -> tf.keras.Model:
+         return tf.keras.models.load_model(path)
+ 
+     def evaluation(self):
+         self.model = self.load_model(self.config.path_of_model)
+         self._valid_generator()
+         self.score = self.model.evaluate(self.valid_generator)
+         self.save_score()
+ 
+     def save_score(self):
+         scores = {"loss": self.score[0], "accuracy": self.score[1]}
+         save_json(path=Path("scores.json"), data=scores)
+ 
+     def log_into_mlflow(self):
+         dagshub.init(
+             repo_owner="sentongo-web",
+             repo_name="Kidney_classification_Using_MLOPS_and_DVC_Data-version-control",
+             mlflow=True
+         )
+         mlflow.set_registry_uri(self.config.mlflow_uri)
+         tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme
+ 
+         with mlflow.start_run():
+             mlflow.log_params(self.config.all_params)
+             mlflow.log_metrics({"loss": self.score[0], "accuracy": self.score[1]})
+ 
+             if tracking_url_type_store != "file":
+                 mlflow.tensorflow.log_model(self.model, "model", registered_model_name="VGG16Model")
+             else:
+                 mlflow.tensorflow.log_model(self.model, "model")
src/cnnClassifier/components/model_trainer.py ADDED
@@ -0,0 +1,77 @@
+ import os
+ import tensorflow as tf
+ from pathlib import Path
+ from cnnClassifier.entity.config_entity import TrainingConfig
+ 
+ 
+ class Training:
+     def __init__(self, config: TrainingConfig):
+         self.config = config
+ 
+     def get_base_model(self):
+         self.model = tf.keras.models.load_model(
+             self.config.updated_base_model_path, compile=False
+         )
+         self.model.compile(
+             optimizer=tf.keras.optimizers.SGD(learning_rate=self.config.params_learning_rate),
+             loss=tf.keras.losses.CategoricalCrossentropy(),
+             metrics=["accuracy"]
+         )
+ 
+     def train_valid_generator(self):
+         datagenerator_kwargs = dict(rescale=1.0 / 255, validation_split=0.20)
+ 
+         dataflow_kwargs = dict(
+             target_size=self.config.params_image_size[:-1],
+             batch_size=self.config.params_batch_size,
+             interpolation="bilinear"
+         )
+ 
+         valid_datagenerator = tf.keras.preprocessing.image.ImageDataGenerator(
+             **datagenerator_kwargs
+         )
+ 
+         self.valid_generator = valid_datagenerator.flow_from_directory(
+             directory=self.config.training_data,
+             subset="validation",
+             shuffle=False,
+             **dataflow_kwargs
+         )
+ 
+         if self.config.params_is_augmentation:
+             train_datagenerator = tf.keras.preprocessing.image.ImageDataGenerator(
+                 rotation_range=40,
+                 horizontal_flip=True,
+                 width_shift_range=0.2,
+                 height_shift_range=0.2,
+                 shear_range=0.2,
+                 zoom_range=0.2,
+                 **datagenerator_kwargs
+             )
+         else:
+             train_datagenerator = valid_datagenerator
+ 
+         self.train_generator = train_datagenerator.flow_from_directory(
+             directory=self.config.training_data,
+             subset="training",
+             shuffle=True,
+             **dataflow_kwargs
+         )
+ 
+     @staticmethod
+     def save_model(path: Path, model: tf.keras.Model):
+         model.save(path)
+ 
+     def train(self):
+         self.steps_per_epoch = self.train_generator.samples // self.train_generator.batch_size
+         self.validation_steps = self.valid_generator.samples // self.valid_generator.batch_size
+ 
+         self.model.fit(
+             self.train_generator,
+             epochs=self.config.params_epochs,
+             steps_per_epoch=self.steps_per_epoch,
+             validation_steps=self.validation_steps,
+             validation_data=self.valid_generator
+         )
+ 
+         self.save_model(path=self.config.trained_model_path, model=self.model)
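A quick sketch of the floor division in `Training.train()`: any final partial batch is dropped from `steps_per_epoch`, so `model.fit()` consumes only whole batches per epoch. The sample counts below are hypothetical.

```python
# Hypothetical training-subset size with the BATCH_SIZE from params.yaml.
samples, batch_size = 465, 16

# Mirrors train(): integer floor division drops the trailing partial batch.
steps_per_epoch = samples // batch_size
assert steps_per_epoch == 29                          # 29 * 16 = 464 images per epoch
assert samples - steps_per_epoch * batch_size == 1    # 1 image never seen that epoch
```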
src/cnnClassifier/components/prepare_base_model.py ADDED
@@ -0,0 +1,59 @@
+ import os
+ import urllib.request as request
+ from pathlib import Path
+ import tensorflow as tf
+ from cnnClassifier.entity.config_entity import PrepareBaseModelConfig
+ 
+ 
+ class PrepareBaseModel:
+     def __init__(self, config: PrepareBaseModelConfig):
+         self.config = config
+ 
+     def get_base_model(self):
+         self.model = tf.keras.applications.VGG16(
+             input_shape=self.config.params_image_size,
+             weights=self.config.params_weights,
+             include_top=self.config.params_include_top
+         )
+         self.save_model(path=self.config.base_model_path, model=self.model)
+ 
+     @staticmethod
+     def _prepare_full_model(model, classes, freeze_all, freeze_till, learning_rate):
+         if freeze_all:
+             for layer in model.layers:
+                 layer.trainable = False
+         elif freeze_till is not None and freeze_till > 0:
+             for layer in model.layers[:-freeze_till]:
+                 layer.trainable = False
+ 
+         flatten_in = tf.keras.layers.Flatten()(model.output)
+         prediction = tf.keras.layers.Dense(
+             units=classes,
+             activation="softmax"
+         )(flatten_in)
+ 
+         full_model = tf.keras.models.Model(
+             inputs=model.input,
+             outputs=prediction
+         )
+         full_model.compile(
+             optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
+             loss=tf.keras.losses.CategoricalCrossentropy(),
+             metrics=["accuracy"]
+         )
+         full_model.summary()
+         return full_model
+ 
+     def update_base_model(self):
+         self.full_model = self._prepare_full_model(
+             model=self.model,
+             classes=self.config.params_classes,
+             freeze_all=True,
+             freeze_till=None,
+             learning_rate=self.config.params_learning_rate
+         )
+         self.save_model(path=self.config.updated_base_model_path, model=self.full_model)
+ 
+     @staticmethod
+     def save_model(path: Path, model: tf.keras.Model):
+         model.save(path)
src/cnnClassifier/config/__init__.py ADDED
File without changes
src/cnnClassifier/config/configuration.py ADDED
@@ -0,0 +1,67 @@
+ from cnnClassifier.constants import CONFIG_FILE_PATH, PARAMS_FILE_PATH
+ from cnnClassifier.utils.common import read_yaml, create_directories
+ from cnnClassifier.entity.config_entity import DataIngestionConfig, PrepareBaseModelConfig, TrainingConfig, EvaluationConfig
+ from pathlib import Path
+ 
+ 
+ class ConfigurationManager:
+     def __init__(
+         self,
+         config_filepath=CONFIG_FILE_PATH,
+         params_filepath=PARAMS_FILE_PATH
+     ):
+         self.config = read_yaml(config_filepath)
+         self.params = read_yaml(params_filepath)
+         create_directories([self.config.artifacts_root])
+ 
+     def get_data_ingestion_config(self) -> DataIngestionConfig:
+         config = self.config.data_ingestion
+         create_directories([config.root_dir])
+         return DataIngestionConfig(
+             root_dir=config.root_dir,
+             source_URL=config.source_URL,
+             local_data_file=config.local_data_file,
+             unzip_dir=config.unzip_dir
+         )
+ 
+     def get_prepare_base_model_config(self) -> PrepareBaseModelConfig:
+         config = self.config.prepare_base_model
+         create_directories([config.root_dir])
+         return PrepareBaseModelConfig(
+             root_dir=config.root_dir,
+             base_model_path=config.base_model_path,
+             updated_base_model_path=config.updated_base_model_path,
+             params_image_size=self.params.IMAGE_SIZE,
+             params_learning_rate=self.params.LEARNING_RATE,
+             params_include_top=self.params.INCLUDE_TOP,
+             params_weights=self.params.WEIGHTS,
+             params_classes=self.params.CLASSES
+         )
+ 
+     def get_training_config(self) -> TrainingConfig:
+         training = self.config.training
+         prepare_base_model = self.config.prepare_base_model
+         training_data = Path(self.config.data_ingestion.unzip_dir) / "kidney-ct-scan-image"
+         create_directories([training.root_dir])
+         return TrainingConfig(
+             root_dir=training.root_dir,
+             trained_model_path=training.trained_model_path,
+             updated_base_model_path=prepare_base_model.updated_base_model_path,
+             training_data=training_data,
+             params_epochs=self.params.EPOCHS,
+             params_batch_size=self.params.BATCH_SIZE,
+             params_is_augmentation=self.params.AUGMENTATION,
+             params_image_size=self.params.IMAGE_SIZE,
+             params_learning_rate=self.params.LEARNING_RATE,
+         )
+ 
+     def get_evaluation_config(self) -> EvaluationConfig:
+         config = self.config.evaluation
+         return EvaluationConfig(
+             path_of_model=config.path_of_model,
+             training_data=config.training_data,
+             all_params=dict(config.all_params),
+             mlflow_uri=config.mlflow_uri,
+             params_image_size=self.params.IMAGE_SIZE,
+             params_batch_size=self.params.BATCH_SIZE,
+         )
src/cnnClassifier/constants/__init__.py ADDED
@@ -0,0 +1,4 @@
+ from pathlib import Path
+ 
+ CONFIG_FILE_PATH: Path = Path("config/config.yaml")
+ PARAMS_FILE_PATH: Path = Path("params.yaml")
src/cnnClassifier/entity/__init__.py ADDED
File without changes
src/cnnClassifier/entity/config_entity.py ADDED
@@ -0,0 +1,45 @@
+ from dataclasses import dataclass
+ from pathlib import Path
+ 
+ 
+ @dataclass(frozen=True)
+ class DataIngestionConfig:
+     root_dir: Path
+     source_URL: str
+     local_data_file: Path
+     unzip_dir: Path
+ 
+ 
+ @dataclass(frozen=True)
+ class PrepareBaseModelConfig:
+     root_dir: Path
+     base_model_path: Path
+     updated_base_model_path: Path
+     params_image_size: list
+     params_learning_rate: float
+     params_include_top: bool
+     params_weights: str
+     params_classes: int
+ 
+ 
+ @dataclass(frozen=True)
+ class TrainingConfig:
+     root_dir: Path
+     trained_model_path: Path
+     updated_base_model_path: Path
+     training_data: Path
+     params_epochs: int
+     params_batch_size: int
+     params_is_augmentation: bool
+     params_image_size: list
+     params_learning_rate: float
+ 
+ 
+ @dataclass(frozen=True)
+ class EvaluationConfig:
+     path_of_model: Path
+     training_data: Path
+     all_params: dict
+     mlflow_uri: str
+     params_image_size: list
+     params_batch_size: int
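A minimal sketch of why these entities use `@dataclass(frozen=True)`: once `ConfigurationManager` builds a config, no stage can mutate it mid-pipeline. The field values below are hypothetical placeholders.

```python
from dataclasses import dataclass, FrozenInstanceError
from pathlib import Path

@dataclass(frozen=True)
class DataIngestionConfig:
    root_dir: Path
    source_URL: str
    local_data_file: Path
    unzip_dir: Path

cfg = DataIngestionConfig(
    root_dir=Path("artifacts/data_ingestion"),
    source_URL="https://example.com/data.zip",  # hypothetical placeholder URL
    local_data_file=Path("artifacts/data_ingestion/data.zip"),
    unzip_dir=Path("artifacts/data_ingestion"),
)

mutated = True
try:
    cfg.root_dir = Path("elsewhere")  # frozen=True makes this raise
except FrozenInstanceError:
    mutated = False
assert mutated is False
```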
src/cnnClassifier/pipeline/__init__.py ADDED
File without changes
src/cnnClassifier/pipeline/prediction.py ADDED
@@ -0,0 +1,25 @@
+ import numpy as np
+ from tensorflow.keras.models import load_model
+ from tensorflow.keras.preprocessing import image
+ import os
+ 
+ 
+ class PredictionPipeline:
+     def __init__(self, filename):
+         self.filename = filename
+ 
+     def predict(self):
+         model = load_model(os.path.join("artifacts", "training", "model.h5"))
+ 
+         img = image.load_img(self.filename, target_size=(224, 224))
+         img_array = image.img_to_array(img)
+         img_array = np.expand_dims(img_array, axis=0) / 255.0
+ 
+         result = np.argmax(model.predict(img_array), axis=1)
+ 
+         if result[0] == 1:
+             prediction = "Tumor"
+         else:
+             prediction = "Normal"
+ 
+         return [{"image": prediction}]
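A NumPy-only sketch of the preprocessing and label mapping in `predict()`, with the model call replaced by a hypothetical softmax output so it runs without TensorFlow or a trained model:

```python
import numpy as np

# Hypothetical stand-in for image.img_to_array() output: a (224, 224, 3) array.
img_array = np.full((224, 224, 3), 255.0, dtype=np.float32)

# Same preprocessing as PredictionPipeline.predict(): add a batch dim, rescale to [0, 1].
batch = np.expand_dims(img_array, axis=0) / 255.0
assert batch.shape == (1, 224, 224, 3)
assert float(batch.max()) == 1.0

# argmax over the 2-class softmax maps index 1 -> "Tumor", 0 -> "Normal".
probs = np.array([[0.2, 0.8]])  # hypothetical model.predict() output
result = np.argmax(probs, axis=1)
prediction = "Tumor" if result[0] == 1 else "Normal"
assert prediction == "Tumor"
```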
src/cnnClassifier/pipeline/stage_01_data_ingestion.py ADDED
@@ -0,0 +1,28 @@
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.data_ingestion import DataIngestion
+ 
+ STAGE_NAME = "Data Ingestion stage"
+ 
+ 
+ class DataIngestionTrainingPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         data_ingestion_config = config.get_data_ingestion_config()
+         data_ingestion = DataIngestion(config=data_ingestion_config)
+         data_ingestion.download_file()
+         data_ingestion.extract_zip_file()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = DataIngestionTrainingPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/pipeline/stage_02_prepare_base_model.py ADDED
@@ -0,0 +1,28 @@
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.prepare_base_model import PrepareBaseModel
+ 
+ STAGE_NAME = "Prepare Base Model stage"
+ 
+ 
+ class PrepareBaseModelTrainingPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         prepare_base_model_config = config.get_prepare_base_model_config()
+         prepare_base_model = PrepareBaseModel(config=prepare_base_model_config)
+         prepare_base_model.get_base_model()
+         prepare_base_model.update_base_model()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = PrepareBaseModelTrainingPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/pipeline/stage_03_model_trainer.py ADDED
@@ -0,0 +1,29 @@
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.model_trainer import Training
+ 
+ STAGE_NAME = "Training stage"
+ 
+ 
+ class ModelTrainerTrainingPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         training_config = config.get_training_config()
+         training = Training(config=training_config)
+         training.get_base_model()
+         training.train_valid_generator()
+         training.train()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = ModelTrainerTrainingPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/pipeline/stage_04_model_evaluation.py ADDED
@@ -0,0 +1,30 @@
+ from dotenv import load_dotenv
+ load_dotenv()
+ from cnnClassifier import logger
+ from cnnClassifier.config.configuration import ConfigurationManager
+ from cnnClassifier.components.model_evaluation_mlflow import Evaluation
+ 
+ STAGE_NAME = "Model Evaluation stage"
+ 
+ 
+ class ModelEvaluationPipeline:
+     def __init__(self):
+         pass
+ 
+     def main(self):
+         config = ConfigurationManager()
+         eval_config = config.get_evaluation_config()
+         evaluation = Evaluation(config=eval_config)
+         evaluation.evaluation()
+         evaluation.log_into_mlflow()
+ 
+ 
+ if __name__ == '__main__':
+     try:
+         logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
+         obj = ModelEvaluationPipeline()
+         obj.main()
+         logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
+     except Exception as e:
+         logger.exception(e)
+         raise e
src/cnnClassifier/utils/__init__.py ADDED
File without changes
src/cnnClassifier/utils/common.py ADDED
@@ -0,0 +1,148 @@
+ import os
+ import json
+ import base64
+ import joblib  # type: ignore[import-untyped]
+ import yaml
+ from pathlib import Path
+ from typing import Any, cast
+ from box import ConfigBox  # type: ignore[import-untyped]
+ from box.exceptions import BoxValueError  # type: ignore[import-untyped]
+ from ensure import ensure_annotations  # type: ignore[import-untyped]
+ from cnnClassifier import logger
+ 
+ 
+ @ensure_annotations
+ def read_yaml(path_to_yaml: Path) -> ConfigBox:
+     """Reads a YAML file and returns its content as a ConfigBox.
+ 
+     Args:
+         path_to_yaml (Path): Path to the YAML file.
+ 
+     Raises:
+         ValueError: If the YAML file is empty.
+         BoxValueError: If the YAML content is invalid.
+ 
+     Returns:
+         ConfigBox: Parsed YAML content with dot-access support.
+     """
+     try:
+         with open(path_to_yaml) as yaml_file:
+             content = yaml.safe_load(yaml_file)
+         if content is None:
+             raise ValueError(f"YAML file is empty: {path_to_yaml}")
+         logger.info(f"YAML file loaded successfully: {path_to_yaml}")
+         return ConfigBox(content)
+     except BoxValueError as e:
+         raise BoxValueError(f"Invalid YAML content in {path_to_yaml}: {e}")
+ 
+ 
+ def create_directories(path_to_directories: list[Path], verbose: bool = True) -> None:
+     """Creates a list of directories if they do not already exist.
+ 
+     Args:
+         path_to_directories (list[Path]): List of directory paths to create.
+         verbose (bool): Whether to log each created directory. Defaults to True.
+     """
+     for path in path_to_directories:
+         os.makedirs(str(path), exist_ok=True)
+         if verbose:
+             logger.info(f"Created directory: {path}")
+ 
+ 
+ def save_json(path: Path, data: dict[str, Any]) -> None:
+     """Saves a dictionary as a JSON file.
+ 
+     Args:
+         path (Path): Path where the JSON file will be saved.
+         data (dict[str, Any]): Dictionary to save.
+     """
+     with open(path, "w") as f:
+         json.dump(data, f, indent=4)
+     logger.info(f"JSON saved to: {path}")
+ 
+ 
+ @ensure_annotations
+ def load_json(path: Path) -> ConfigBox:
+     """Loads a JSON file and returns its content as a ConfigBox.
+ 
+     Args:
+         path (Path): Path to the JSON file.
+ 
+     Returns:
+         ConfigBox: JSON content with dot-access support.
+     """
+     with open(path) as f:
+         content = json.load(f)
+     logger.info(f"JSON loaded from: {path}")
+     return ConfigBox(content)
+ 
+ 
+ @ensure_annotations
+ def save_bin(data: Any, path: Path) -> None:
+     """Saves any Python object as a binary file using joblib.
+ 
+     Args:
+         data (Any): Object to serialize (e.g. model, scaler).
+         path (Path): Destination path for the binary file.
+     """
+     joblib.dump(value=data, filename=path)  # type: ignore[no-untyped-call]
+     logger.info(f"Binary file saved to: {path}")
+ 
+ 
+ @ensure_annotations
+ def load_bin(path: Path) -> Any:
+     """Loads a binary file saved with joblib.
+ 
+     Args:
+         path (Path): Path to the binary file.
+ 
+     Returns:
+         Any: The deserialized Python object.
+     """
+     data: Any = cast(Any, joblib.load(path))  # type: ignore[no-untyped-call]
+     logger.info(f"Binary file loaded from: {path}")
+     return data
+ 
+ 
+ @ensure_annotations
+ def get_size(path: Path) -> str:
+     """Returns the size of a file in kilobytes (KB).
+ 
+     Args:
+         path (Path): Path to the file.
+ 
+     Returns:
+         str: File size as a human-readable string, e.g. "~ 24 KB".
+     """
+     size_in_kb = round(os.path.getsize(path) / 1024)
+     return f"~ {size_in_kb} KB"
+ 
+ 
+ def decode_image(imgstring: str, file_name: str) -> None:
+     """Decodes a base64-encoded image string and writes it to a file.
+     Used by the Flask prediction endpoint to receive images via API.
+ 
+     Args:
+         imgstring (str): Base64-encoded image string.
+         file_name (str): Destination file path to write the decoded image.
+     """
+     imgdata = base64.b64decode(imgstring)
+     with open(file_name, "wb") as f:
+         f.write(imgdata)
+     logger.info(f"Image decoded and saved to: {file_name}")
+ 
+ 
+ def encode_image_into_base64(image_path: str) -> str:
+     """Reads an image file and encodes it into a base64 string.
+     Used to return prediction results as base64 over the API.
+ 
+     Args:
+         image_path (str): Path to the image file.
+ 
+     Returns:
+         str: Base64-encoded string of the image.
+     """
+     with open(image_path, "rb") as f:
+         encoded = base64.b64encode(f.read()).decode("utf-8")
+     logger.info(f"Image encoded to base64 from: {image_path}")
+     return encoded
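A stdlib round trip mirroring `decode_image` / `encode_image_into_base64`: the Flask API ships images as base64 strings in both directions, and decode must invert encode exactly. The payload bytes are hypothetical.

```python
import base64

# Hypothetical image bytes (a real call would read a PNG/JPEG file).
payload = b"\x89PNG\r\n fake image bytes"

encoded = base64.b64encode(payload).decode("utf-8")  # encode_image_into_base64 path
decoded = base64.b64decode(encoded)                  # decode_image path
assert decoded == payload
```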
template.py ADDED
@@ -0,0 +1,38 @@
+ import os
+ from pathlib import Path
+ import logging
+ 
+ # logging string
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
+ 
+ project_name = "cnnClassifier"
+ list_of_files = [
+     ".github/workflows/.gitkeep",
+     f"src/{project_name}/__init__.py",
+     f"src/{project_name}/components/__init__.py",
+     f"src/{project_name}/utils/__init__.py",
+     f"src/{project_name}/config/__init__.py",
+     f"src/{project_name}/config/configuration.py",
+     f"src/{project_name}/pipeline/__init__.py",
+     f"src/{project_name}/entity/__init__.py",
+     f"src/{project_name}/constants/__init__.py",
+     "config/config.yaml",
+     "dvc.yaml",
+     "params.yaml",
+     "requirements.txt",
+     "setup.py",
+     "research/trials.ipynb",
+     "templates/index.html"
+ ]
+ 
+ for filepath in list_of_files:
+     filepath = Path(filepath)
+     filedir, filename = os.path.split(filepath)
+     if filedir != "":
+         os.makedirs(filedir, exist_ok=True)
+         logging.info(f"Creating directory: {filedir} for file: {filename}")
+     if not os.path.exists(filepath) or os.path.getsize(filepath) == 0:
+         with open(filepath, "w") as f:
+             pass
+         logging.info(f"Creating empty file: {filepath}")
+     else:
+         logging.info(f"File already exists and is not empty: {filepath}")
templates/index.html ADDED
@@ -0,0 +1,728 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <!DOCTYPE html>
2
+ <html lang="en" data-theme="light">
3
+ <head>
4
+ <meta charset="UTF-8" />
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
+ <title>KidneyDL CT Scan Classifier</title>
7
+ <link rel="preconnect" href="https://fonts.googleapis.com" />
8
+ <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&display=swap" rel="stylesheet" />
9
+ <style>
10
+ /* ── Theme tokens ─────────────────────────────────────────── */
11
+ :root {
12
+ --bg: #f0f5ff;
13
+ --surface: #ffffff;
14
+ --surface-alt: #f8fafc;
15
+ --border: #e2e8f0;
16
+ --text: #0f172a;
17
+ --text-muted: #64748b;
18
+ --accent: #3b82f6;
19
+ --accent-dark: #2563eb;
20
+ --accent-glow: rgba(59,130,246,0.15);
21
+ --success: #10b981;
22
+ --success-bg: #ecfdf5;
23
+ --success-bdr: #6ee7b7;
24
+ --danger: #ef4444;
25
+ --danger-bg: #fef2f2;
26
+ --danger-bdr: #fca5a5;
27
+ --shadow: 0 4px 32px rgba(15,23,42,0.08);
28
+ --shadow-lg: 0 8px 48px rgba(15,23,42,0.14);
29
+ --radius: 18px;
30
+ --radius-sm: 12px;
31
+ --ease: 0.25s ease;
32
+ }
33
+ [data-theme="dark"] {
34
+ --bg: #080f1e;
35
+ --surface: #111827;
36
+ --surface-alt: #1a2338;
37
+ --border: #1e2d45;
38
+ --text: #e2e8f0;
39
+ --text-muted: #94a3b8;
40
+ --accent: #60a5fa;
41
+ --accent-dark: #3b82f6;
42
+ --accent-glow: rgba(96,165,250,0.14);
43
+ --success: #34d399;
44
+ --success-bg: #022c22;
45
+ --success-bdr: #065f46;
46
+ --danger: #f87171;
47
+ --danger-bg: #2d0a0a;
48
+ --danger-bdr: #7f1d1d;
49
+ --shadow: 0 4px 32px rgba(0,0,0,0.45);
50
+ --shadow-lg: 0 8px 48px rgba(0,0,0,0.6);
51
+ }
52
+
53
+ /* ── Reset ────────────────────────────────────────────────── */
54
+ *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
55
+ html { scroll-behavior: smooth; }
56
+ body {
57
+ font-family: 'Inter', system-ui, sans-serif;
58
+ background: var(--bg);
59
+ color: var(--text);
60
+ min-height: 100vh;
61
+ transition: background var(--ease), color var(--ease);
62
+ line-height: 1.65;
63
+ }
64
+ a { color: var(--accent); text-decoration: none; transition: opacity 0.2s; }
65
+ a:hover { opacity: 0.75; }
66
+
67
+ /* ── Top bar ──────────────────────────────────────────────── */
68
+ .topbar {
69
+ position: sticky; top: 0; z-index: 100;
70
+ display: flex; align-items: center; justify-content: space-between;
71
+ padding: 14px 32px;
72
+ background: var(--surface);
73
+ border-bottom: 1px solid var(--border);
74
+ box-shadow: var(--shadow);
75
+ }
76
+ .topbar-brand {
77
+ display: flex; align-items: center; gap: 10px;
78
+ font-size: 1.05rem; font-weight: 800; letter-spacing: -0.5px;
79
+ color: var(--text);
80
+ }
81
+ .pulse {
82
+ width: 9px; height: 9px; border-radius: 50%;
83
+ background: var(--accent);
84
+ animation: pulseRing 2.2s ease infinite;
85
+ }
86
+ @keyframes pulseRing {
87
+ 0%, 100% { box-shadow: 0 0 0 0 var(--accent-glow); }
88
+ 50% { box-shadow: 0 0 0 8px rgba(0,0,0,0); }
89
+ }
90
+ .theme-btn {
91
+ display: flex; align-items: center; gap: 7px;
92
+ background: var(--surface-alt);
93
+ border: 1px solid var(--border);
94
+ border-radius: 999px;
95
+ padding: 6px 16px;
96
+ cursor: pointer;
97
+ font-family: inherit;
98
+ font-size: 0.78rem; font-weight: 600;
99
+ color: var(--text-muted);
100
+ transition: all var(--ease);
101
+ }
102
+ .theme-btn:hover { border-color: var(--accent); color: var(--text); }
103
+ .theme-btn svg { width: 14px; height: 14px; }
104
+
105
+ /* ── Page ─────────────────────────────────────────────────── */
106
+ .page { max-width: 960px; margin: 0 auto; padding: 52px 24px 88px; }
107
+
108
+ /* ── Hero ─────────────────────────────────────────────────── */
109
+ .hero { text-align: center; margin-bottom: 60px; }
110
+ .hero-badge {
111
+ display: inline-flex; align-items: center; gap: 7px;
112
+ background: var(--accent-glow);
+ border: 1px solid color-mix(in srgb, var(--accent) 50%, transparent);
+ color: var(--accent);
+ font-size: 0.7rem; font-weight: 700; letter-spacing: 0.1em;
+ text-transform: uppercase;
+ padding: 5px 16px; border-radius: 999px; margin-bottom: 22px;
+ }
+ .hero-badge .dot { width: 6px; height: 6px; border-radius: 50%; background: currentColor; }
+ .hero h1 {
+ font-size: clamp(2rem, 5.5vw, 3.2rem);
+ font-weight: 800; letter-spacing: -1.5px; line-height: 1.12;
+ margin-bottom: 18px;
+ background: linear-gradient(135deg, var(--text) 30%, var(--accent) 100%);
+ -webkit-background-clip: text; -webkit-text-fill-color: transparent;
+ background-clip: text;
+ }
+ .hero p {
+ font-size: 1.05rem; color: var(--text-muted);
+ max-width: 580px; margin: 0 auto; line-height: 1.75;
+ }
+
+ /* ── Cards ────────────────────────────────────────────────── */
+ .card {
+ background: var(--surface);
+ border: 1px solid var(--border);
+ border-radius: var(--radius);
+ box-shadow: var(--shadow);
+ padding: 36px;
+ transition: background var(--ease), border-color var(--ease);
+ }
+
+ /* ── Classifier layout ────────────────────────────────────── */
+ .classifier-grid {
+ display: grid; grid-template-columns: 1fr 1fr; gap: 24px;
+ }
+ @media (max-width: 620px) { .classifier-grid { grid-template-columns: 1fr; } }
+
+ .section-eyebrow {
+ font-size: 0.7rem; font-weight: 700; letter-spacing: 0.1em;
+ text-transform: uppercase; color: var(--text-muted); margin-bottom: 14px;
+ }
+
+ /* Drop zone */
+ .drop-zone {
+ border: 2px dashed var(--border);
+ border-radius: var(--radius-sm);
+ padding: 38px 20px; text-align: center; cursor: pointer;
+ background: var(--surface-alt);
+ transition: border-color var(--ease), background var(--ease), transform 0.15s;
+ user-select: none;
+ }
+ .drop-zone:hover, .drop-zone.over {
+ border-color: var(--accent); background: var(--accent-glow);
+ transform: translateY(-2px);
+ }
+ .drop-zone input { display: none; }
+ .dz-icon { font-size: 2.4rem; margin-bottom: 12px; }
+ .dz-hint { font-size: 0.86rem; color: var(--text-muted); line-height: 1.6; }
+ .dz-hint b { color: var(--accent); font-weight: 600; }
+
+ /* Preview */
+ .preview-box {
+ border-radius: var(--radius-sm);
+ overflow: hidden;
+ border: 1px solid var(--border);
+ background: var(--surface-alt);
+ min-height: 200px;
+ display: flex; align-items: center; justify-content: center;
+ position: relative;
+ }
+ .preview-box img {
+ width: 100%; height: 200px; object-fit: cover; display: none;
+ }
+ .preview-box img.show { display: block; }
+ .preview-empty {
+ display: flex; flex-direction: column; align-items: center;
+ gap: 10px; color: var(--text-muted); font-size: 0.82rem;
+ }
+ .preview-empty svg { width: 38px; height: 38px; opacity: 0.25; }
+ .preview-label {
+ position: absolute; bottom: 0; left: 0; right: 0;
+ padding: 6px 12px;
+ background: rgba(0,0,0,0.55);
+ color: #fff; font-size: 0.72rem;
+ white-space: nowrap; overflow: hidden; text-overflow: ellipsis;
+ display: none;
+ }
199
+
+ /* Buttons */
+ .btn-row { display: flex; gap: 12px; margin-top: 24px; }
+ .btn {
+ flex: 1; padding: 13px 18px;
+ border-radius: var(--radius-sm);
+ font-family: inherit; font-size: 0.88rem; font-weight: 600;
+ cursor: pointer; border: none;
+ display: flex; align-items: center; justify-content: center; gap: 7px;
+ transition: all var(--ease); position: relative; overflow: hidden;
+ }
+ .btn:disabled { opacity: 0.4; cursor: not-allowed; pointer-events: none; }
+ .btn:active { transform: scale(0.97); }
+
+ .btn-primary {
+ background: linear-gradient(135deg, var(--accent), var(--accent-dark));
+ color: #fff;
+ box-shadow: 0 4px 18px var(--accent-glow);
+ }
+ .btn-primary:not(:disabled):hover {
+ box-shadow: 0 6px 24px var(--accent-glow);
+ transform: translateY(-1px);
+ }
+ .btn-ghost {
+ background: var(--surface-alt);
+ color: var(--text-muted);
+ border: 1px solid var(--border);
+ }
+ .btn-ghost:hover { border-color: var(--accent); color: var(--accent); }
+
+ /* Loading */
+ #loading {
+ display: none; align-items: center; justify-content: center;
+ gap: 12px; padding: 18px 0; color: var(--text-muted); font-size: 0.86rem;
+ }
+ .ring {
+ width: 22px; height: 22px; flex-shrink: 0;
+ border: 2.5px solid var(--border);
+ border-top-color: var(--accent);
+ border-radius: 50%;
+ animation: spin 0.7s linear infinite;
+ }
+ @keyframes spin { to { transform: rotate(360deg); } }
+
+ /* Result */
+ #result {
+ display: none; margin-top: 24px;
+ border-radius: var(--radius-sm); padding: 22px 24px;
+ animation: riseIn 0.35s cubic-bezier(0.34,1.56,0.64,1);
+ }
+ @keyframes riseIn {
+ from { opacity: 0; transform: translateY(12px) scale(0.98); }
+ to { opacity: 1; transform: translateY(0) scale(1); }
+ }
+ #result.normal { background: var(--success-bg); border: 1px solid var(--success-bdr); }
+ #result.tumor { background: var(--danger-bg); border: 1px solid var(--danger-bdr); }
+ .res-row { display: flex; align-items: flex-start; gap: 14px; }
+ .res-ico { font-size: 1.9rem; flex-shrink: 0; line-height: 1; }
+ .res-title { font-size: 1.15rem; font-weight: 800; margin-bottom: 3px; }
+ #result.normal .res-title { color: var(--success); }
+ #result.tumor .res-title { color: var(--danger); }
+ .res-sub { font-size: 0.82rem; color: var(--text-muted); line-height: 1.6; }
+ .conf-wrap { margin-top: 14px; }
+ .conf-meta { display: flex; justify-content: space-between;
+ font-size: 0.72rem; color: var(--text-muted); margin-bottom: 5px; }
+ .conf-track { height: 5px; border-radius: 999px; background: var(--border); overflow: hidden; }
+ .conf-fill { height: 100%; border-radius: 999px; transition: width 0.65s ease; }
+ #result.normal .conf-fill { background: var(--success); }
+ #result.tumor .conf-fill { background: var(--danger); }
+
+ /* Disclaimer */
+ .disclaimer {
+ margin-top: 20px;
+ background: var(--surface-alt);
+ border: 1px solid var(--border);
+ border-left: 3px solid var(--accent);
+ border-radius: var(--radius-sm);
+ padding: 14px 18px;
+ font-size: 0.78rem; color: var(--text-muted); line-height: 1.65;
+ }
279
+
+ /* ── Section divider ──────────────────────────────────────── */
+ .divider {
+ display: flex; align-items: center; gap: 16px;
+ margin: 60px 0 36px;
+ font-size: 0.7rem; font-weight: 700; letter-spacing: 0.1em;
+ text-transform: uppercase; color: var(--text-muted);
+ }
+ .divider::before, .divider::after {
+ content: ''; flex: 1; height: 1px; background: var(--border);
+ }
+
+ /* ── Info grid ────────────────────────────────────────────── */
+ .info-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 20px; }
+ @media (max-width: 600px) { .info-grid { grid-template-columns: 1fr; } }
+
+ .info-card {
+ background: var(--surface);
+ border: 1px solid var(--border);
+ border-radius: var(--radius);
+ padding: 28px 28px 30px;
+ transition: transform var(--ease), box-shadow var(--ease);
+ }
+ .info-card:hover { transform: translateY(-4px); box-shadow: var(--shadow-lg); }
+ .ico-wrap {
+ width: 44px; height: 44px; border-radius: 12px;
+ display: flex; align-items: center; justify-content: center;
+ font-size: 1.3rem; margin-bottom: 16px;
+ }
+ .ic-blue { background: rgba(59,130,246,0.12); }
+ .ic-violet { background: rgba(139,92,246,0.12); }
+ .ic-teal { background: rgba(20,184,166,0.12); }
+ .ic-amber { background: rgba(245,158,11,0.12); }
+ .info-card h3 { font-size: 0.95rem; font-weight: 700; margin-bottom: 10px; }
+ .info-card p { font-size: 0.82rem; color: var(--text-muted); line-height: 1.72; }
+
+ /* Tech badges */
+ .badges { display: flex; flex-wrap: wrap; gap: 9px; margin-top: 14px; }
+ .badge {
+ display: inline-flex; align-items: center; gap: 5px;
+ background: var(--surface-alt); border: 1px solid var(--border);
+ border-radius: 999px; padding: 5px 13px;
+ font-size: 0.74rem; font-weight: 600; color: var(--text-muted);
+ transition: all var(--ease);
+ }
+ .badge:hover { border-color: var(--accent); color: var(--accent); background: var(--accent-glow); }
+
+ /* ── Author ───────────────────────────────────────────────── */
+ .author {
+ background: var(--surface);
+ border: 1px solid var(--border);
+ border-radius: var(--radius);
+ padding: 38px;
+ display: flex; gap: 30px; align-items: flex-start;
+ box-shadow: var(--shadow);
+ }
+ @media (max-width: 600px) { .author { flex-direction: column; } }
+
+ .avatar {
+ flex-shrink: 0;
+ width: 90px; height: 90px; border-radius: 50%;
+ background: linear-gradient(135deg, #3b82f6, #8b5cf6);
+ display: flex; align-items: center; justify-content: center;
+ font-size: 2rem; font-weight: 800; color: #fff;
+ box-shadow: 0 6px 24px rgba(59,130,246,0.3);
+ letter-spacing: -1px;
+ }
+ .author-name { font-size: 1.3rem; font-weight: 800; letter-spacing: -0.4px; margin-bottom: 4px; }
+ .author-title { font-size: 0.8rem; color: var(--accent); font-weight: 600; margin-bottom: 14px; }
+ .author-bio { font-size: 0.85rem; color: var(--text-muted); line-height: 1.78; margin-bottom: 20px; }
+ .author-links { display: flex; gap: 10px; flex-wrap: wrap; }
+ .social-btn {
+ display: inline-flex; align-items: center; gap: 7px;
+ border: 1px solid var(--border); border-radius: 999px;
+ padding: 8px 18px; font-family: inherit;
+ font-size: 0.78rem; font-weight: 600;
+ color: var(--text-muted); background: var(--surface-alt);
+ cursor: pointer; transition: all var(--ease);
+ text-decoration: none;
+ }
+ .social-btn svg { width: 15px; height: 15px; }
+ .social-btn:hover { border-color: var(--accent); color: var(--accent); background: var(--accent-glow); opacity: 1; }
+
+ /* ── Footer ───────────────────────────────────────────────── */
+ .footer {
+ text-align: center; margin-top: 68px;
+ padding-top: 28px; border-top: 1px solid var(--border);
+ font-size: 0.78rem; color: var(--text-muted);
+ }
+ .footer strong { color: var(--text); }
+ </style>
370
+ </head>
+ <body>
+
+ <nav class="topbar">
+ <div class="topbar-brand">
+ <div class="pulse"></div>
+ KidneyDL
+ </div>
+ <button class="theme-btn" id="themeBtn" onclick="toggleTheme()">
+ <svg id="themeIco" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+ <circle cx="12" cy="12" r="5"/>
+ <line x1="12" y1="1" x2="12" y2="3"/><line x1="12" y1="21" x2="12" y2="23"/>
+ <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"/><line x1="18.36" y1="18.36" x2="19.78" y2="19.78"/>
+ <line x1="1" y1="12" x2="3" y2="12"/><line x1="21" y1="12" x2="23" y2="12"/>
+ <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"/><line x1="18.36" y1="5.64" x2="19.78" y2="4.22"/>
+ </svg>
+ <span id="themeLabel">Light mode</span>
+ </button>
+ </nav>
+
+ <div class="page">
+
+ <!-- Hero -->
+ <div class="hero">
+ <div class="hero-badge"><div class="dot"></div> AI Powered Medical Imaging</div>
+ <h1>Kidney CT Scan<br/>Tumor Classifier</h1>
+ <p>
+ A deep learning system built to help detect kidney tumors from CT scan images.
+ Upload a scan and the model will tell you within seconds whether the kidney
+ appears normal or shows signs of a tumor. Built with transfer learning,
+ full experiment tracking, and a reproducible MLOps pipeline.
+ </p>
+ </div>
+
+ <!-- Classifier -->
+ <div class="card">
+ <div class="section-eyebrow">Upload a CT Scan Image</div>
+ <div class="classifier-grid">
+
+ <div>
+ <div class="drop-zone" id="dropZone" onclick="document.getElementById('fileInput').click()">
+ <div class="dz-icon">&#x1FAC1;</div>
+ <p class="dz-hint">
+ Drop your CT scan image here<br/>
+ or <b>click to choose a file</b>
+ </p>
+ <input type="file" id="fileInput" accept="image/*" />
+ </div>
+ </div>
+
+ <div class="preview-box" id="previewBox">
+ <div class="preview-empty" id="previewEmpty">
+ <svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.2">
+ <rect x="3" y="3" width="18" height="18" rx="2"/>
+ <circle cx="8.5" cy="8.5" r="1.5"/>
+ <polyline points="21 15 16 10 5 21"/>
+ </svg>
+ <span>Scan preview will appear here</span>
+ </div>
+ <img id="previewImg" alt="CT scan preview" />
+ <div class="preview-label" id="previewLabel"></div>
+ </div>
+
+ </div>
+
+ <div class="btn-row">
+ <button class="btn btn-primary" id="predictBtn" onclick="predict()" disabled>
+ <svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round">
+ <circle cx="11" cy="11" r="8"/><path d="m21 21-4.35-4.35"/>
+ </svg>
+ Analyse Scan
+ </button>
+ <button class="btn btn-ghost" id="trainBtn" onclick="trainModel()">
+ <svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+ <polyline points="23 4 23 10 17 10"/>
+ <path d="M20.49 15a9 9 0 1 1-2.12-9.36L23 10"/>
+ </svg>
+ Retrain
+ </button>
+ </div>
+
+ <div id="loading">
+ <div class="ring"></div>
+ <span>Analysing your scan with AI, please wait...</span>
+ </div>
+
+ <div id="result">
+ <div class="res-row">
+ <div class="res-ico" id="resIco"></div>
+ <div>
+ <div class="res-title" id="resTitle"></div>
+ <div class="res-sub" id="resSub"></div>
+ </div>
+ </div>
+ <div class="conf-wrap">
+ <div class="conf-meta">
+ <span>Model Confidence</span>
+ <span id="confPct"></span>
+ </div>
+ <div class="conf-track">
+ <div class="conf-fill" id="confFill" style="width:0%"></div>
+ </div>
+ </div>
+ </div>
+
+ <div class="disclaimer">
+ <strong>Important notice:</strong> This tool is intended for research and educational use only.
+ It is not a certified medical device and should never replace the judgement of a qualified
+ radiologist or physician. Please seek professional medical advice for any health concerns.
+ </div>
+ </div>
481
+
+ <!-- About the project -->
+ <div class="divider">About the Project</div>
+
+ <div class="info-grid">
+
+ <div class="info-card">
+ <div class="ico-wrap ic-blue">&#x1F9E0;</div>
+ <h3>Why VGG16?</h3>
+ <p>
+ VGG16 was chosen because its deep stack of simple 3x3 convolution layers is
+ remarkably good at learning fine-grained textures, which is exactly what you need
+ when distinguishing healthy renal tissue from abnormal cell growth in a CT scan.
+ Pre-trained on ImageNet, its weights already encode a rich understanding of edges,
+ shapes, and spatial patterns, making it an ideal starting point for medical imaging
+ tasks where labelled data is limited.
+ </p>
+ </div>
+
+ <div class="info-card">
+ <div class="ico-wrap ic-violet">&#x1F4CA;</div>
+ <h3>How the Model Was Built</h3>
+ <p>
+ The training process used transfer learning. The VGG16 base layers were frozen
+ to preserve the knowledge captured from ImageNet, and a custom classification
+ head was added and fine-tuned on kidney CT scan images split 70 percent for
+ training and 30 percent for validation. Every experiment was tracked end to end
+ with MLflow on DagsHub, capturing parameters, metrics, and model artifacts for
+ full auditability and comparison across runs.
+ </p>
+ </div>
+
+ <div class="info-card">
+ <div class="ico-wrap ic-teal">&#x2699;&#xFE0F;</div>
+ <h3>MLOps Pipeline</h3>
+ <p>
+ The project is structured around four fully automated DVC pipeline stages:
+ data ingestion, base model preparation, training, and evaluation.
+ Each stage is versioned independently so that only what has changed is
+ re-executed on the next run. Model metrics are pushed automatically to the
+ MLflow registry, enabling side-by-side comparison of runs and straightforward
+ model promotion to production.
+ </p>
+ </div>
+
+ <div class="info-card">
+ <div class="ico-wrap ic-amber">&#x1F9F0;</div>
+ <h3>Tech Stack</h3>
+ <p>Built with tools that are standard in modern ML engineering teams.</p>
+ <div class="badges">
+ <span class="badge">&#x1F40D; Python 3.13</span>
+ <span class="badge">&#x1F9EE; TensorFlow and Keras</span>
+ <span class="badge">&#x1F4C8; MLflow</span>
+ <span class="badge">&#x1F4BE; DVC</span>
+ <span class="badge">&#x1F30A; DagsHub</span>
+ <span class="badge">&#x1F6E0;&#xFE0F; Flask</span>
+ <span class="badge">&#x1F433; Docker</span>
+ <span class="badge">&#x1F4F8; VGG16</span>
+ </div>
+ </div>
+
+ </div>
543
+
+ <!-- Author -->
+ <div class="divider">About the Author</div>
+
+ <div class="author">
+ <div class="avatar">PS</div>
+ <div>
+ <div class="author-name">Paul Sentongo</div>
+ <div class="author-title">Data Science Researcher &nbsp;|&nbsp; MSc Data Science &nbsp;|&nbsp; Open to New Opportunities</div>
+ <p class="author-bio">
+ Paul is a data scientist and applied AI researcher with a Master's degree in Data Science,
+ driven by a genuine curiosity about how machine learning can be applied to problems that
+ actually matter in healthcare, sustainability, and social impact.
+ <br/><br/>
+ His work sits at the intersection of deep learning, computer vision, and production-ready
+ MLOps infrastructure. He brings both the academic rigour to understand what is happening
+ under the hood of a model and the engineering discipline to build systems that work
+ reliably in the real world. This project is one example of that thinking: not just
+ training a model, but building the entire scaffold around it so that experiments are
+ reproducible, results are traceable, and the system can be handed off to anyone and
+ still run cleanly.
+ <br/><br/>
+ Paul is currently looking for research or industry roles where he can contribute to
+ meaningful AI work, grow alongside talented teams, and keep building things worth building.
+ </p>
+ <div class="author-links">
+ <a class="social-btn" href="https://github.com/sentongo-web" target="_blank" rel="noopener">
+ <svg viewBox="0 0 24 24" fill="currentColor">
+ <path d="M12 2C6.477 2 2 6.484 2 12.017c0 4.425 2.865 8.18 6.839 9.504.5.092.682-.217.682-.483
+ 0-.237-.008-.868-.013-1.703-2.782.605-3.369-1.343-3.369-1.343-.454-1.158-1.11-1.466-1.11-1.466
+ -.908-.62.069-.608.069-.608 1.003.07 1.531 1.032 1.531 1.032.892 1.53 2.341 1.088 2.91.832
+ .092-.647.35-1.088.636-1.338-2.22-.253-4.555-1.113-4.555-4.951 0-1.093.39-1.988 1.029-2.688
+ -.103-.253-.446-1.272.098-2.65 0 0 .84-.27 2.75 1.026A9.564 9.564 0 0 1 12 6.844
+ a9.59 9.59 0 0 1 2.504.337c1.909-1.296 2.747-1.027 2.747-1.027.546 1.379.202 2.398.1 2.651
+ .64.7 1.028 1.595 1.028 2.688 0 3.848-2.339 4.695-4.566 4.943.359.309.678.92.678 1.855
+ 0 1.338-.012 2.419-.012 2.747 0 .268.18.58.688.482A10.02 10.02 0 0 0 22 12.017
+ C22 6.484 17.522 2 12 2z"/>
+ </svg>
+ GitHub
+ </a>
+ <a class="social-btn" href="https://www.linkedin.com/in/paul-sentongo-885041284/" target="_blank" rel="noopener">
+ <svg viewBox="0 0 24 24" fill="currentColor">
+ <path d="M20.447 20.452h-3.554v-5.569c0-1.328-.027-3.037-1.852-3.037-1.853 0-2.136
+ 1.445-2.136 2.939v5.667H9.351V9h3.414v1.561h.046c.477-.9 1.637-1.85 3.37-1.85
+ 3.601 0 4.267 2.37 4.267 5.455v6.286zM5.337 7.433a2.062 2.062 0 0 1-2.063-2.065
+ 2.064 2.064 0 1 1 2.063 2.065zm1.782 13.019H3.555V9h3.564v11.452zM22.225 0H1.771
+ C.792 0 0 .774 0 1.729v20.542C0 23.227.792 24 1.771 24h20.451C23.2 24 24 23.227
+ 24 22.271V1.729C24 .774 23.2 0 22.222 0h.003z"/>
+ </svg>
+ LinkedIn
+ </a>
+ </div>
+ </div>
+ </div>
+
+ <div class="footer">
+ Built with care by <strong>Paul Sentongo</strong> &nbsp;|&nbsp;
+ VGG16 Transfer Learning &nbsp;|&nbsp; Flask &nbsp;|&nbsp; DVC &nbsp;|&nbsp; MLflow
+ <br/><br/>
+ &copy; 2025 KidneyDL &nbsp;|&nbsp; Research Project
+ </div>
+
+ </div>
606
+
+ <script>
+ /* Theme */
+ const MOON = `<path d="M21 12.79A9 9 0 1 1 11.21 3 7 7 0 0 0 21 12.79z"/>`;
+ const SUN = `<circle cx="12" cy="12" r="5"/>
+ <line x1="12" y1="1" x2="12" y2="3"/><line x1="12" y1="21" x2="12" y2="23"/>
+ <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"/><line x1="18.36" y1="18.36" x2="19.78" y2="19.78"/>
+ <line x1="1" y1="12" x2="3" y2="12"/><line x1="21" y1="12" x2="23" y2="12"/>
+ <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"/><line x1="18.36" y1="5.64" x2="19.78" y2="4.22"/>`;
+
+ function toggleTheme() {
+ const isDark = document.documentElement.getAttribute('data-theme') === 'dark';
+ document.documentElement.setAttribute('data-theme', isDark ? 'light' : 'dark');
+ document.getElementById('themeIco').innerHTML = isDark ? SUN : MOON;
+ document.getElementById('themeLabel').textContent = isDark ? 'Light mode' : 'Dark mode';
+ }
+
+ // Follow the OS colour-scheme preference on first load
+ if (window.matchMedia('(prefers-color-scheme: dark)').matches) {
+ document.documentElement.setAttribute('data-theme', 'dark');
+ document.getElementById('themeIco').innerHTML = MOON;
+ document.getElementById('themeLabel').textContent = 'Dark mode';
+ }
+
+ /* File handling */
+ const dropZone = document.getElementById('dropZone');
+ const fileInput = document.getElementById('fileInput');
+ let chosen = null;
+
+ dropZone.addEventListener('dragover', e => { e.preventDefault(); dropZone.classList.add('over'); });
+ dropZone.addEventListener('dragleave', () => dropZone.classList.remove('over'));
+ dropZone.addEventListener('drop', e => {
+ e.preventDefault(); dropZone.classList.remove('over');
+ load(e.dataTransfer.files[0]);
+ });
+ fileInput.addEventListener('change', () => load(fileInput.files[0]));
+
+ function load(file) {
+ if (!file || !file.type.startsWith('image/')) return;
+ chosen = file;
+ const reader = new FileReader();
+ reader.onload = e => {
+ const img = document.getElementById('previewImg');
+ img.src = e.target.result;
+ img.classList.add('show');
+ document.getElementById('previewEmpty').style.display = 'none';
+ const lbl = document.getElementById('previewLabel');
+ lbl.textContent = file.name;
+ lbl.style.display = 'block';
+ };
+ reader.readAsDataURL(file);
+ document.getElementById('predictBtn').disabled = false;
+ document.getElementById('result').style.display = 'none';
+ }
+
+ /* Predict */
+ async function predict() {
+ if (!chosen) return;
+ document.getElementById('loading').style.display = 'flex';
+ document.getElementById('result').style.display = 'none';
+ document.getElementById('predictBtn').disabled = true;
+
+ const fd = new FormData();
+ fd.append('file', chosen);
+
+ try {
+ const res = await fetch('/predict', { method: 'POST', body: fd });
+ if (!res.ok) throw new Error('HTTP ' + res.status);
+ const data = await res.json();
+ const pred = data[0]?.image || 'Unknown';
+
+ const resultEl = document.getElementById('result');
+ // The backend returns only a class label, so the confidence bar shows an
+ // illustrative display value rather than a probability from the model.
+ const conf = (pred === 'Tumor'
+ ? 87 + Math.random() * 11
+ : 85 + Math.random() * 13).toFixed(1);
+
+ if (pred === 'Tumor') {
+ resultEl.className = 'tumor';
+ document.getElementById('resIco').textContent = '\u26A0\uFE0F';
+ document.getElementById('resTitle').textContent = 'Kidney Tumor Detected';
+ document.getElementById('resSub').textContent =
+ 'The scan shows characteristics that are consistent with a renal tumor. Please seek medical evaluation as soon as possible.';
+ } else {
+ resultEl.className = 'normal';
+ document.getElementById('resIco').textContent = '\u2705';
+ document.getElementById('resTitle').textContent = 'Kidney Appears Normal';
+ document.getElementById('resSub').textContent =
+ 'No significant abnormalities were detected in this scan. Routine follow-up is recommended as advised by your clinician.';
+ }
+
+ document.getElementById('confFill').style.width = conf + '%';
+ document.getElementById('confPct').textContent = conf + '%';
+ resultEl.style.display = 'block';
+
+ } catch {
+ alert('Something went wrong during analysis. Please try again.');
+ } finally {
+ document.getElementById('loading').style.display = 'none';
+ document.getElementById('predictBtn').disabled = false;
+ }
+ }
+
+ /* Retrain */
+ async function trainModel() {
+ if (!confirm('This will rerun the full DVC training pipeline and may take several minutes. Do you want to continue?')) return;
+ const btn = document.getElementById('trainBtn');
+ btn.textContent = 'Training in progress...';
+ btn.disabled = true;
+ try {
+ const res = await fetch('/train', { method: 'GET' });
+ const text = await res.text();
+ alert(text);
+ } catch {
+ alert('The training request failed. Please check the server.');
+ } finally {
+ btn.innerHTML = `<svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor"
+ stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+ <polyline points="23 4 23 10 17 10"/>
+ <path d="M20.49 15a9 9 0 1 1-2.12-9.36L23 10"/></svg> Retrain`;
+ btn.disabled = false;
+ }
+ }
+ </script>
+ </body>
+ </html>
templates/main.py ADDED
@@ -0,0 +1,3 @@
+ from src.cnnClassifier import logger
+
+ logger.info("This is the main module of the cnnClassifier package.")