Explainable Mutation Pathogenicity Predictor

Dual-output CNN model for predicting whether a DNA mutation is pathogenic or benign, with a built-in explainability signal.


Key Features

• Predicts whether a mutation is pathogenic or benign
• Provides a mutation importance score
• Explainability enforced via a contrastive importance loss
• Trained on the ClinVar GRCh38 dataset
• Supports genome-wide variants across all chromosomes


Model Architecture

Input encoding (1101-dimensional):

• Reference sequence (99bp one-hot)
• Mutated sequence (99bp one-hot)
• Difference mask
• Mutation type one-hot
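The encoding above can be sketched as follows. The exact layout that yields 1101 dimensions is not documented here, so the helper names, the per-position difference mask, and the four-way mutation-type vector are assumptions for illustration:

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq: str) -> np.ndarray:
    """One-hot encode a DNA sequence as (len, 4); unknown bases stay all-zero."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            arr[i, j] = 1.0
    return arr

def encode_variant(ref_seq: str, alt_seq: str, mut_type_idx: int,
                   n_types: int = 4) -> np.ndarray:
    """Concatenate ref one-hot, alt one-hot, a per-position difference mask,
    and a mutation-type one-hot into one flat feature vector."""
    ref = one_hot(ref_seq)   # (99, 4)
    alt = one_hot(alt_seq)   # (99, 4)
    diff = (ref != alt).any(axis=1).astype(np.float32)  # (99,) 1 where bases differ
    mut_type = np.zeros(n_types, dtype=np.float32)
    mut_type[mut_type_idx] = 1.0
    return np.concatenate([ref.ravel(), alt.ravel(), diff, mut_type])
```

With this layout the vector has 99×4 + 99×4 + 99 + 4 = 895 entries; the released model's 1101-dimensional encoding presumably includes additional features not described in this card.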

Architecture:

Shared CNN feature extractor
→ Classification head (pathogenicity prediction)
→ Importance head (mutation explainability)
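A minimal sketch of such a dual-head network, assuming the per-position features are arranged as channels (ref 4 + alt 4 + diff mask 1 = 9 channels over 99 positions). Layer sizes and channel counts are illustrative, not the released weights:

```python
import torch
import torch.nn as nn

class DualHeadCNN(nn.Module):
    """Hypothetical dual-output model: a shared 1D-CNN trunk feeding a
    pathogenicity classifier and a per-position importance head."""

    def __init__(self, in_channels: int = 9, seq_len: int = 99):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(128, 1),  # single pathogenicity logit
        )
        self.imp_head = nn.Conv1d(128, 1, kernel_size=1)  # per-position importance

    def forward(self, x):
        h = self.trunk(x)                                         # (B, 128, L)
        logit = self.cls_head(h).squeeze(-1)                      # (B,)
        importance = torch.sigmoid(self.imp_head(h)).squeeze(1)   # (B, L) in [0, 1]
        return logit, importance
```

Both heads share the trunk, so the importance scores are grounded in the same features the classifier uses.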


Performance

Dataset: 16,000 mutations (8,000 pathogenic, 8,000 benign)

Test metrics:

Accuracy: 57.5%
AUC: 0.6166
F1 score: 0.5896
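For reference, these metrics can be reproduced from model outputs with plain NumPy; this is a generic reimplementation, not the evaluation script used for the numbers above, and the 0.5 decision threshold is an assumption:

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of correct predictions."""
    return float(np.mean(y_true == y_pred))

def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (pathogenic) class."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    return 2 * tp / (2 * tp + fp + fn)

def auc(y_true, y_prob):
    """ROC AUC: probability that a random positive outranks a random negative."""
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return float(wins / (len(pos) * len(neg)))
```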

Explainability:

Average mutation importance score: 0.83
Contrastive gap: 0.409

Explainability was prioritized over raw predictive accuracy in training.


Explainability Mechanism

The importance head directly predicts the importance of the mutation position.

Contrastive loss ensures:

importance(mutation position) > importance(context)

This encourages the model to attribute its prediction to the mutated position itself rather than to the surrounding context.
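A hinge-style version of this constraint can be sketched as follows. The margin value and the mean-over-context formulation are assumptions; under this formulation, the reported contrastive gap would be the average of `at_mut - context` over the test set:

```python
import torch

def contrastive_importance_loss(importance, mut_pos, margin: float = 0.5):
    """Penalize samples where importance at the mutation position does not
    exceed the mean importance over context positions by at least `margin`.
    importance: (B, L) per-position scores; mut_pos: (B,) mutation indices."""
    B, L = importance.shape
    at_mut = importance[torch.arange(B), mut_pos]          # score at mutation site
    mask = torch.ones_like(importance, dtype=torch.bool)
    mask[torch.arange(B), mut_pos] = False
    context = importance[mask].view(B, L - 1).mean(dim=1)  # mean score elsewhere
    return torch.relu(margin - (at_mut - context)).mean()
```

The loss is zero whenever the mutation site already stands out by the margin, so it only pushes on under-attributed examples.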


Usage

from huggingface_hub import hf_hub_download
import torch

model_path = hf_hub_download(
    repo_id="nileshhanotia/mutation-pathogenicity-predictor",
    filename="pytorch_model.pth"
)

# Load on CPU; inspect the checkpoint keys to locate the state dict
checkpoint = torch.load(model_path, map_location="cpu")

⚠️ Research Use Only

This model is a research prototype and is NOT intended for clinical or diagnostic use.
Predictions should not be used for medical decision making.