🧬 Multi-Cancer Image Classification with CNN

📌 Project Overview

This project classifies cancer-related medical images with Convolutional Neural Networks (CNNs) implemented in TensorFlow/Keras. The images come from the ALL folder of the Multi Cancer dataset on Kaggle. The model is trained with supervised learning to distinguish between the classes in that folder.

Deep learning techniques, specifically CNN architectures, are applied to process and classify images automatically without manual feature extraction. This project demonstrates an end-to-end machine learning pipeline from data loading and preprocessing to model training, evaluation, saving, and prediction.


📂 Project Structure

├── Multi Cancer Dataset
│   ├── ALL
│   │   ├── Class_1
│   │   ├── Class_2
│   │   ├── ...
│
├── model5.h5                 # Trained CNN model saved in HDF5 format
├── cancer_classification.py  # Main training & prediction script
├── README.md                 # Project documentation (this file)

⚙️ Requirements

To run this project, you need the following dependencies:

  • Python 3.8+
  • TensorFlow 2.x
  • NumPy
  • Matplotlib
  • Keras (integrated within TensorFlow)
  • Kaggle Dataset Access (if using Kaggle Notebook)

You can install the dependencies using:

pip install tensorflow numpy matplotlib

🧩 Data Preprocessing

The dataset is organized in directory format where each folder represents a class label.

Example:

/ALL
    /Class_1
        image1.jpg
        image2.jpg
    /Class_2
        image1.jpg
        image2.jpg
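
Because the folder names double as class labels, it can help to check how many images each class holds before training. The following is a minimal sketch using only the standard library; the function name and the list of accepted extensions are assumptions made for illustration, not part of the project script.

```python
from pathlib import Path

# Extensions treated as images; an assumption, adjust to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Map each class folder name under `root` to its number of image files."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Calling count_images_per_class("Multi Cancer Dataset/ALL") would return a dictionary such as {"Class_1": ..., "Class_2": ...}, which makes class imbalance easy to spot.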

Steps taken:

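  1. Rescaling Images – All images are normalized by scaling pixel values to the range [0, 1].

  2. Image Resizing – Every image is resized to 150x150 pixels to ensure uniform input size.

  3. Validation Split – Rescaling and the split are both configured on a single ImageDataGenerator:

    • rescale=1./255
    • validation_split=0.1 (10% of data reserved for validation)

Rescaling keeps input values in a small, consistent range, which helps training converge, and the held-out validation set makes overfitting visible. No augmentation transforms (rotation, flip, zoom) are applied at this stage.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.1)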
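
The later sections use train_generator and validation_generator without showing how they are created. One plausible way to build them from the same ImageDataGenerator settings is sketched below. The dataset path, the batch size, and the helper name make_generators are assumptions for illustration; the project script may differ.

```python
# Assumed settings: DATA_DIR and BATCH_SIZE are illustrative, not taken from the project script.
DATA_DIR = "Multi Cancer Dataset/ALL"
IMAGE_SIZE = (150, 150)
BATCH_SIZE = 32


def make_generators(data_dir=DATA_DIR):
    """Return (train_generator, validation_generator) for a class-per-folder dataset."""
    # Imported inside the function so this module loads even without TensorFlow installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.1)
    common = dict(
        directory=data_dir,
        target_size=IMAGE_SIZE,    # resize every image to 150x150
        batch_size=BATCH_SIZE,
        class_mode="categorical",  # one-hot labels, matching categorical_crossentropy
    )
    # The same generator serves both subsets; `subset` selects the 90% or 10% part.
    train_generator = datagen.flow_from_directory(subset="training", **common)
    validation_generator = datagen.flow_from_directory(subset="validation", **common)
    return train_generator, validation_generator
```

With this sketch, train_generator.class_indices holds the folder-name-to-index mapping that the model's output layer and the prediction function rely on.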

🏗️ Model Architecture

The model is a Sequential CNN consisting of:

  1. Conv2D + MaxPooling Layers:

    • Extract features from the images.
    • 3 convolutional layers with increasing filter counts (32, 64, 128), each using 3x3 kernels.
    • Each followed by max pooling to reduce spatial dimensions.
  2. Flatten Layer:

    • Converts 2D feature maps into 1D feature vectors.
  3. Dense Layers:

    • Fully connected layers for learning global patterns.
    • A hidden layer with 512 neurons (ReLU activation).
    • Output layer with softmax activation for multi-class classification.

from tensorflow import keras
from tensorflow.keras import layers

# train_generator must already exist: its class_indices sets the output size
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(len(train_generator.class_indices), activation='softmax')
])
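
To see how the spatial dimensions shrink through the network, the shape arithmetic can be traced by hand. The sketch below assumes the Keras defaults used in the model above (unpadded convolutions with stride 1, and 2x2 pooling with stride 2); the helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' (unpadded) stride-1 convolution."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling (floor division)."""
    return size // pool


def trace_shapes(input_size=150, filters=(32, 64, 128)):
    """Return the (height, width, channels) shape after each Conv2D + MaxPooling2D pair."""
    shapes = []
    size = input_size
    for n_filters in filters:
        size = pool_out(conv_out(size))
        shapes.append((size, size, n_filters))
    return shapes


shapes = trace_shapes()
flat_features = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
dense_params = flat_features * 512 + 512  # weights plus biases of the 512-unit layer
print(shapes)         # [(74, 74, 32), (36, 36, 64), (17, 17, 128)]
print(flat_features)  # 36992
print(dense_params)   # 18940416
```

The arithmetic suggests that most of the trainable parameters sit in the first Dense layer rather than in the convolutions, which is worth keeping in mind when considering model size.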

⚡ Model Compilation & Training

  • Loss Function: Categorical Crossentropy
  • Optimizer: Adam
  • Metric: Accuracy

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
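
For intuition about the loss, categorical cross-entropy for a single image is the negative log of the probability the model assigns to the true class. The following pure-Python sketch illustrates the formula only; it is not the Keras implementation, and the eps guard against log(0) is an assumption of this example.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: -sum(y_true[i] * log(y_pred[i]))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# One-hot label for the second of three classes, with two made-up predictions.
y_true = [0.0, 1.0, 0.0]
confident = categorical_crossentropy(y_true, [0.1, 0.8, 0.1])
unsure = categorical_crossentropy(y_true, [0.4, 0.3, 0.3])
print(round(confident, 4))  # 0.2231
print(round(unsure, 4))     # 1.204
```

A confident correct prediction gives a small loss, while spreading probability away from the true class increases it.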

The model is trained for 10 epochs:

# Keep the returned History object so the per-epoch metrics can be plotted later
history = model.fit(train_generator,
                    validation_data=validation_generator,
                    epochs=10)

💾 Model Saving

After training, the model is saved in .h5 format:

model.save("model5.h5")

This allows reusing the model later without retraining.
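
A saved model can be restored with keras.models.load_model("model5.h5"), but the prediction function also needs the class-name mapping, which otherwise lives only in the training generator. One option, sketched here as an assumption rather than something the project script does, is to store that mapping as JSON next to the model. The file name class_indices.json and both helper names are hypothetical.

```python
import json


def save_class_indices(class_indices, path="class_indices.json"):
    """Write the {class_name: index} mapping to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(class_indices, f, indent=2)


def load_class_indices(path="class_indices.json"):
    """Read the {class_name: index} mapping back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

After training, save_class_indices(train_generator.class_indices) would write the mapping once, and a later session could pass load_class_indices() to the prediction function without rebuilding the generator.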


🔮 Prediction Function

A custom guess() function is provided to make predictions on new images:

Steps:

  1. Load and resize image to 150x150.
  2. Normalize pixel values.
  3. Predict with the trained CNN.
  4. Map prediction to class label.
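  5. Display image with predicted class title.

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import load_img, img_to_array

def guess(image_path, model, class_indices):
    # Load the image at the model's input size and scale pixels to [0, 1]
    img = load_img(image_path, target_size=(150, 150))
    img_array = img_to_array(img) / 255.0
    img_array = np.expand_dims(img_array, axis=0)  # add the batch dimension

    # Pick the most probable class and map its index back to the folder name
    prediction = model.predict(img_array)
    predicted_class = int(np.argmax(prediction[0]))
    class_labels = {v: k for k, v in class_indices.items()}
    predicted_label = class_labels[predicted_class]

    plt.imshow(img)
    plt.title(f"model_guess: {predicted_label}")
    plt.axis("off")
    plt.show()
    return predicted_label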

Example usage:

guess("test_image.jpg", model, train_generator.class_indices)
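
The guess() function reports only the single most likely class. When the model's confidence matters, the softmax output can be ranked to show the runner-up classes as well. The helper below is a hypothetical sketch, and the probabilities and three-class mapping are made up for the example.

```python
def top_k_labels(probabilities, class_indices, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    index_to_label = {index: label for label, index in class_indices.items()}
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(index_to_label[i], float(p)) for i, p in ranked[:k]]


# Hypothetical softmax output for one image and a made-up three-class mapping.
example_probs = [0.05, 0.85, 0.10]
example_classes = {"Class_1": 0, "Class_2": 1, "Class_3": 2}
print(top_k_labels(example_probs, example_classes, k=2))
# [('Class_2', 0.85), ('Class_3', 0.1)]
```

With a trained model, the first argument would be model.predict(img_array)[0], the probability vector for the one image in the batch.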

📊 Results & Evaluation

  • Keras records training and validation accuracy and loss for every epoch in the History object returned by model.fit().

  • These can be plotted using matplotlib to visualize performance trends.

  • Example metrics:

    • Training Accuracy ≈ 90% or higher
    • Validation Accuracy ≈ 85–95% (depending on dataset balance)
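
Before plotting, the logged metrics can be summarized numerically. The sketch below assumes the return value of model.fit() is kept in a variable (called history here, an assumed name) and that its history attribute has the "accuracy" and "val_accuracy" keys Keras produces for metrics=['accuracy']. The numbers are invented for illustration and are not results from this project.

```python
def summarize_history(history):
    """Report the best validation epoch and the train/validation accuracy gap there."""
    val_acc = history["val_accuracy"]
    best = max(range(len(val_acc)), key=val_acc.__getitem__)
    return {
        "best_epoch": best + 1,  # epochs are reported 1-based
        "val_accuracy": val_acc[best],
        "accuracy_gap": history["accuracy"][best] - val_acc[best],
    }


# Made-up numbers for illustration only; these are not results from this project.
fake_history = {
    "accuracy": [0.60, 0.75, 0.85, 0.92],
    "val_accuracy": [0.58, 0.72, 0.80, 0.79],
}
print(summarize_history(fake_history)["best_epoch"])  # 3
```

In practice the call would be summarize_history(history.history). A training accuracy that keeps rising while validation accuracy stalls, as in the made-up data, is the usual sign of overfitting.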

🚀 Possible Improvements

  • Apply data augmentation (rotation, flip, zoom) to generalize better.
  • Use Transfer Learning (e.g., ResNet50, EfficientNet, VGG16) for higher accuracy.
  • Implement early stopping & checkpointing to avoid overfitting.
  • Increase epochs and adjust learning rates for fine-tuning.
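
Of these, early stopping is the simplest to reason about. Keras provides EarlyStopping and ModelCheckpoint callbacks for this purpose; the pure-Python sketch below only illustrates the underlying patience rule and is not the Keras implementation. The function name and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best validation loss: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses, for illustration only.
print(early_stopping_epoch([0.90, 0.70, 0.65, 0.66, 0.68, 0.60]))  # 5
```

In the example, training would halt at epoch 5 and never reach the better loss at epoch 6, which shows why the patience value is a trade-off between wasted epochs and stopping too soon.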

👨‍💻 Author

This project was developed as part of a medical image classification study using deep learning. It can be extended to other cancer types or generalized to different medical imaging problems such as X-ray, MRI, or CT scan analysis.


⚡ In summary: This project demonstrates how to build a deep learning pipeline for medical image classification with CNNs, using TensorFlow/Keras. It covers everything from data preprocessing to model training, saving, and prediction visualization.



Model tree for CernovaAI/CANet-v1.3: fine-tuned from the base model CernovaAI/CANet-v1.