Explainable Mutation Pathogenicity Predictor

A dual-output CNN model that predicts whether a DNA mutation is pathogenic or benign, with explicit explainability built into the architecture.


Key Features

• Predicts pathogenic vs. benign mutations
• Provides a per-mutation importance score
• Explainability enforced via a contrastive importance loss
• Trained on the ClinVar GRCh38 dataset
• Supports variants genome-wide, across all chromosomes


Model Architecture

Input encoding (1101 dimensions):

• Reference sequence (99 bp, one-hot)
• Mutated sequence (99 bp, one-hot)
• Difference mask
• Mutation type (one-hot)
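
The encoding above can be sketched as a feature-vector builder. The exact composition of the 1101 dimensions is not documented here, so this illustration assumes a three-way mutation-type vocabulary (SNV/insertion/deletion), which sums to 894 dimensions rather than 1101; the function names are hypothetical.

```python
import numpy as np

BASES = "ACGT"
MUT_TYPES = ["SNV", "insertion", "deletion"]  # assumed vocabulary

def one_hot_seq(seq):
    """One-hot encode a DNA sequence into a flat (len(seq) * 4,) vector."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq):
        if base in BASES:  # unknown bases (e.g. N) stay all-zero
            arr[i, BASES.index(base)] = 1.0
    return arr.reshape(-1)

def encode_variant(ref_seq, mut_seq, mut_type):
    """Concatenate ref one-hot, mut one-hot, difference mask, and type one-hot."""
    assert len(ref_seq) == len(mut_seq) == 99
    ref = one_hot_seq(ref_seq)   # 396 dims
    mut = one_hot_seq(mut_seq)   # 396 dims
    # Difference mask: 1.0 at positions where ref and mut disagree.
    diff = np.array([float(a != b) for a, b in zip(ref_seq, mut_seq)],
                    dtype=np.float32)  # 99 dims
    type_vec = np.zeros(len(MUT_TYPES), dtype=np.float32)
    type_vec[MUT_TYPES.index(mut_type)] = 1.0
    return np.concatenate([ref, mut, diff, type_vec])
```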

Architecture:

CNN → feature extraction → two parallel heads:

• Classification head (pathogenic prediction)
• Importance head (mutation explainability)
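
A minimal PyTorch sketch of this dual-head layout; the layer sizes, kernel size, and class name are illustrative assumptions, not the released weights.

```python
import torch
import torch.nn as nn

class PathogenicityNet(nn.Module):
    """Dual-output CNN: shared feature extractor, two heads (hypothetical sizes)."""

    def __init__(self, in_dim=1101, channels=64):
        super().__init__()
        # Treat the flat 1101-dim encoding as a 1-channel signal for Conv1d.
        self.features = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(channels, 1)  # pathogenic-vs-benign logit
        self.importance = nn.Linear(channels, 1)  # mutation-importance score

    def forward(self, x):
        h = self.features(x.unsqueeze(1))          # (batch, channels)
        logit = self.classifier(h)                 # classification head
        imp = torch.sigmoid(self.importance(h))    # importance head in [0, 1]
        return logit, imp
```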


Performance

Dataset: 16,000 mutations (8,000 pathogenic, 8,000 benign)

Test metrics:

Accuracy: 57.5%
AUC: 0.6166
F1 score: 0.5896

Explainability:

Mean mutation importance score: 0.83
Contrastive gap: 0.409

Explainability was prioritized over raw accuracy in this design.


Explainability Mechanism

The importance head directly predicts the importance of the mutated position.

The contrastive loss enforces:

importance(mutation position) > importance(context)

This pushes the model to ground its explanation in the mutated position rather than in the surrounding context.
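
One common way to enforce such an ordering is a hinge-style margin penalty; the function name and margin value below are illustrative assumptions, not the repository's exact loss.

```python
import torch

def contrastive_importance_loss(imp_at_mutation, imp_at_context, margin=0.2):
    """Penalize cases where the mutation-site importance does not exceed
    the context importance by at least `margin` (hinge-style)."""
    gap = imp_at_mutation - imp_at_context
    return torch.clamp(margin - gap, min=0.0).mean()
```

When the mutation site already dominates the context by more than the margin, the loss is zero; otherwise the shortfall is penalized linearly.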


Usage

from huggingface_hub import hf_hub_download
import torch

model_path = hf_hub_download(
    repo_id="nileshhanotia/mutation-pathogenicity-predictor",
    filename="pytorch_model.pth"
)

checkpoint = torch.load(model_path, map_location="cpu")  # load on CPU; move to GPU as needed

⚠️ Research Use Only

This model is a research prototype and is NOT intended for clinical or diagnostic use.
Predictions should not be used for medical decision making.