metadata
title: Chest Cancer Detection AI
emoji: 🩺
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false

🩺 End-to-End Chest Cancer Classification

An MLOps project demonstrating the complete lifecycle of a deep learning model, from data ingestion to CI/CD-powered deployment.



🚀 Live Demo

Experience the deployed application live on Hugging Face Spaces!

➡️ Live Demo Link

(Note: The Space may be asleep if it hasn't been used recently. Please allow a moment for it to wake up.)

🖼️ Application Screenshot

Here is the user interface of the deployed web application.

Application Screenshot

📖 About The Project

This project implements an end-to-end MLOps pipeline for chest cancer image classification. A deep learning model (based on VGG16/ResNet) is trained to distinguish Normal from Adenocarcinoma chest CT scans.

The primary focus is not just on the model's accuracy, but on building a robust, reproducible, and automated system using modern MLOps tools.
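The two-class decision described above can be sketched as a small post-processing step that maps the model's output probabilities to a label. This is a minimal sketch, not the project's actual code: the name `label_from_probs` and the class order in `CLASS_NAMES` are assumptions, since the real order depends on how the training data was indexed.

```python
from typing import Sequence

# Assumed class order (hypothetical); the real order depends on how the
# training pipeline indexed the class folders.
CLASS_NAMES = ("Adenocarcinoma", "Normal")


def label_from_probs(
    probs: Sequence[float], class_names: Sequence[str] = CLASS_NAMES
) -> str:
    """Return the class name with the highest predicted probability."""
    if len(probs) != len(class_names):
        raise ValueError("expected one probability per class")
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best_index]
```

Under the assumed class order, `label_from_probs([0.12, 0.88])` returns `"Normal"`.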

Key Features:

  • Experiment Tracking: Uses MLflow to log parameters, metrics, and model artifacts for every run.
  • Data & Model Versioning: Uses DVC to version large data files and models, keeping the Git repository lightweight.
  • Automated CI/CD: A GitHub Actions workflow automatically tests, builds, and deploys the application on every push to the main branch.
  • Web Application: A user-friendly Flask application serves the trained model for real-time predictions.
  • Containerization: The entire application is containerized with Docker for consistent and portable deployment.
  • Cloud Deployment: Deployed for free on Hugging Face Spaces.
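To illustrate how data versioning keeps the Git repository lightweight: DVC commits a small pointer file containing a content hash to Git, while the large file itself lives in a cache or remote. The sketch below imitates that idea with the standard library only. It is not DVC's implementation, and `file_md5` and `make_pointer` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets never load fully into memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_pointer(path: Path) -> dict:
    """Build a small record that could be committed to Git instead of the file."""
    return {"path": path.name, "md5": file_md5(path), "size": path.stat().st_size}
```

Because the pointer changes whenever the file's content changes, Git history can track dataset and model versions without storing the binaries themselves.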

🛠️ Tech Stack

  • Backend: Python, Flask
  • Model: TensorFlow, Keras
  • MLOps Tools: MLflow, DVC, DagsHub (for remote tracking)
  • CI/CD & Deployment: Docker, GitHub Actions, Hugging Face Spaces

🌊 MLOps Workflow

The project follows a structured MLOps workflow, which is fully automated by the CI/CD pipeline.


graph TD
    A[Start: Push Code to GitHub] --> B{GitHub Actions CI/CD};
    B --> C[CI: Install Dependencies & Run Tests];
    C -->|Success| D[CD: Deploy to Hugging Face];
    D --> E[🚀 Live Application];

    subgraph "DVC & MLflow Cycle (Local/Remote)"
        F[1. `dvc repro`] --> G["2. Pull Data (DVC)"];
        G --> H[3. Train Model];
        H --> I["4. Log Metrics & Model (MLflow)"];
        I --> J["5. Push Model (DVC)"];
    end

⚙️ Getting Started - Local Setup

To run this project on your local machine, follow these steps.

Prerequisites

  • Git
  • Python 3.8+
  • A DagsHub account (for MLflow tracking)

Installation & Setup

  1. Clone the repository:

    git clone https://github.com/AlyyanAhmed21/End-to-End-Chest-Cancer-Classification-using-MLflow-and-DVC.git
    cd End-to-End-Chest-Cancer-Classification-using-MLflow-and-DVC
    
  2. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Set up MLflow/DagsHub credentials: Create a .env file in the root directory and add your credentials. This file is ignored by Git.

    MLFLOW_TRACKING_URI="https://dagshub.com/YourUsername/YourRepoName.mlflow"
    MLFLOW_TRACKING_USERNAME="YourUsername"
    MLFLOW_TRACKING_PASSWORD="YourDagsHubAccessToken"
    
  5. Run the DVC pipeline: This command will execute all stages defined in dvc.yaml (data ingestion, model preparation, training, and evaluation).

    dvc repro
    
  6. Run the Flask application:

    python app.py
    

    Open your browser and navigate to http://localhost:8080 to use the app.
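Step 4 stores the tracking credentials in a .env file, but those values only take effect once they are present in the process environment, which is where MLflow reads `MLFLOW_TRACKING_URI`, `MLFLOW_TRACKING_USERNAME`, and `MLFLOW_TRACKING_PASSWORD` from. How this project loads the file is not stated here, so the following is a minimal standard-library sketch; `load_env_file` is a hypothetical helper, and a library such as python-dotenv is a common alternative.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse KEY="VALUE" lines and export them without overriding existing variables."""
    loaded = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")  # drop surrounding quotes
        os.environ.setdefault(key, value)  # keep values already set in the shell
        loaded[key] = value
    return loaded
```

Calling `load_env_file()` before any MLflow call would make the credentials visible to the tracking client. Variables already set in the shell take precedence, which is convenient when CI injects them as secrets.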

📁 Project Structure

.
├── .dvc/                # DVC metadata
├── .github/workflows/   # GitHub Actions CI/CD pipeline
├── artifacts/           # (Generated by DVC) Data, models, etc.
├── config/              # Configuration files (config.yaml)
├── src/                 # Source code for the project
│   └── cnnClassifier/
│       ├── components/    # Individual pipeline components
│       ├── config/        # Configuration management code
│       ├── entity/        # Custom entity definitions
│       ├── pipeline/      # DVC pipeline stage definitions
│       └── utils/         # Utility functions
├── templates/           # HTML templates for the Flask app
├── .gitignore
├── app.py               # Main Flask application entrypoint
├── dvc.yaml             # DVC pipeline definition
├── Dockerfile           # Docker configuration for deployment
├── main.py              # Main project orchestrator
├── params.yaml          # Model parameters
└── requirements.txt     # Python dependencies
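
The tree lists main.py as the project orchestrator. Its contents are not shown here, so the following is only a sketch of what such an orchestrator might look like: it runs named stages in order and stops at the first failure. The `run_stages` helper and the logger name are hypothetical, and the stage callables would wrap the code under pipeline/.

```python
import logging
from typing import Callable, List, Sequence, Tuple

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("cnnClassifier")  # hypothetical logger name

Stage = Tuple[str, Callable[[], None]]


def run_stages(stages: Sequence[Stage]) -> List[str]:
    """Run each stage in order; log and re-raise on the first failure."""
    completed = []
    for name, stage in stages:
        logger.info("stage %s started", name)
        try:
            stage()
        except Exception:
            logger.exception("stage %s failed", name)
            raise
        logger.info("stage %s completed", name)
        completed.append(name)
    return completed
```

A call such as `run_stages([("data_ingestion", ingest), ("training", train)])`, with `ingest` and `train` standing in for real stage functions, returns the names of the stages that finished.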

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.


Star the repo if you found it useful! ⭐