Explainable Mutation Pathogenicity Predictor

A dual-output CNN model that predicts whether a DNA mutation is pathogenic or benign, with explicit explainability built into the architecture.


Key Features

• Predicts pathogenic vs. benign mutations
• Provides a per-mutation importance score
• Explainability enforced via a contrastive importance loss
• Trained on the ClinVar GRCh38 dataset
• Supports variants genome-wide, across all chromosomes


Model Architecture

Input encoding (1101 dimensions):

• Reference sequence (99 bp, one-hot)
• Mutated sequence (99 bp, one-hot)
• Difference mask
• Mutation type (one-hot)
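
The encoding above can be sketched as a feature-vector builder. The exact composition of the 1101 dimensions is not documented here, so this illustration assumes a three-way mutation-type vocabulary (SNV/insertion/deletion), which sums to 894 dimensions rather than 1101; the function names are hypothetical.

```python
import numpy as np

BASES = "ACGT"
MUT_TYPES = ["SNV", "insertion", "deletion"]  # assumed vocabulary

def one_hot_seq(seq):
    """One-hot encode a DNA sequence into a flat (len(seq) * 4,) vector."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq):
        if base in BASES:  # unknown bases (e.g. N) stay all-zero
            arr[i, BASES.index(base)] = 1.0
    return arr.reshape(-1)

def encode_variant(ref_seq, mut_seq, mut_type):
    """Concatenate ref one-hot, mut one-hot, difference mask, and type one-hot."""
    assert len(ref_seq) == len(mut_seq) == 99
    ref = one_hot_seq(ref_seq)   # 396 dims
    mut = one_hot_seq(mut_seq)   # 396 dims
    # Difference mask: 1.0 at positions where ref and mut disagree.
    diff = np.array([float(a != b) for a, b in zip(ref_seq, mut_seq)],
                    dtype=np.float32)  # 99 dims
    type_vec = np.zeros(len(MUT_TYPES), dtype=np.float32)
    type_vec[MUT_TYPES.index(mut_type)] = 1.0
    return np.concatenate([ref, mut, diff, type_vec])
```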

Architecture:

CNN → feature extraction → two parallel heads:

• Classification head (pathogenic prediction)
• Importance head (mutation explainability)
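
A minimal PyTorch sketch of this dual-head layout; the layer sizes, kernel size, and class name are illustrative assumptions, not the released weights.

```python
import torch
import torch.nn as nn

class PathogenicityNet(nn.Module):
    """Dual-output CNN: shared feature extractor, two heads (hypothetical sizes)."""

    def __init__(self, in_dim=1101, channels=64):
        super().__init__()
        # Treat the flat 1101-dim encoding as a 1-channel signal for Conv1d.
        self.features = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(channels, 1)  # pathogenic-vs-benign logit
        self.importance = nn.Linear(channels, 1)  # mutation-importance score

    def forward(self, x):
        h = self.features(x.unsqueeze(1))          # (batch, channels)
        logit = self.classifier(h)                 # classification head
        imp = torch.sigmoid(self.importance(h))    # importance head in [0, 1]
        return logit, imp
```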


Performance

Dataset: 16,000 mutations (8,000 pathogenic, 8,000 benign)

Test metrics:

Accuracy: 57.5%
AUC: 0.6166
F1 score: 0.5896

Explainability:

Mean mutation importance score: 0.83
Contrastive gap: 0.409

Explainability was prioritized over raw accuracy in this design.


Explainability Mechanism

The importance head directly predicts the importance of the mutated position.

The contrastive loss enforces:

importance(mutation position) > importance(context)

This pushes the model to ground its explanation in the mutated position rather than in the surrounding context.
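
One common way to enforce such an ordering is a hinge-style margin penalty; the function name and margin value below are illustrative assumptions, not the repository's exact loss.

```python
import torch

def contrastive_importance_loss(imp_at_mutation, imp_at_context, margin=0.2):
    """Penalize cases where the mutation-site importance does not exceed
    the context importance by at least `margin` (hinge-style)."""
    gap = imp_at_mutation - imp_at_context
    return torch.clamp(margin - gap, min=0.0).mean()
```

When the mutation site already dominates the context by more than the margin, the loss is zero; otherwise the shortfall is penalized linearly.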


Usage

from huggingface_hub import hf_hub_download
import torch

model_path = hf_hub_download(
    repo_id="nileshhanotia/mutation-pathogenicity-predictor",
    filename="pytorch_model.pth"
)

checkpoint = torch.load(model_path, map_location="cpu")  # load on CPU; move to GPU as needed

⚠️ Research Use Only

This model is a research prototype and is NOT intended for clinical or diagnostic use.
Predictions should not be used for medical decision making.