Explainable Mutation Pathogenicity Predictor
Dual-output CNN model for predicting whether a DNA mutation is pathogenic or benign, with explicit explainability.
Key Features
• Predicts whether a mutation is pathogenic or benign
• Provides mutation importance score
• Explainability enforced via contrastive importance loss
• Trained on ClinVar GRCh38 dataset
• Supports genome-wide variants across chromosomes
Model Architecture
Input encoding (1101 dim):
• Reference sequence (99bp one-hot)
• Mutated sequence (99bp one-hot)
• Difference mask
• Mutation type one-hot
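The card does not spell out the exact byte layout of the 1101-dimensional vector. As an illustrative sketch only (assuming 4-channel one-hot sequences, a per-position difference mask, and a hypothetical 3-way mutation-type vector — which totals 894 dimensions, not the model's 1101, so the released encoding must use a richer layout):

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """One-hot encode a DNA sequence as a (len, 4) array; unknown bases stay all-zero."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            arr[i, j] = 1.0
    return arr

def encode_variant(ref_seq, alt_seq, mut_type_idx, n_mut_types=3):
    """Concatenate reference one-hot, mutated one-hot, a per-position
    difference mask, and a mutation-type one-hot into one flat vector."""
    ref, alt = one_hot(ref_seq), one_hot(alt_seq)
    diff_mask = (ref != alt).any(axis=1).astype(np.float32)  # 1 where bases differ
    mut_type = np.zeros(n_mut_types, dtype=np.float32)
    mut_type[mut_type_idx] = 1.0
    return np.concatenate([ref.ravel(), alt.ravel(), diff_mask, mut_type])

# 99bp window with a single substitution at the center position
x = encode_variant("A" * 49 + "C" + "A" * 49, "A" * 49 + "G" + "A" * 49, 0)
```

The difference mask lets the network locate the mutation without comparing the two one-hot blocks itself.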
Architecture:
CNN → Feature extraction →
→ Classification head (pathogenic prediction)
→ Importance head (mutation explainability)
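A minimal sketch of such a dual-head design in PyTorch (the layer sizes, kernel widths, and 9-channel input are assumptions for illustration, not the released model's configuration):

```python
import torch
import torch.nn as nn

class DualHeadCNN(nn.Module):
    """Shared CNN trunk feeding a classification head and an importance head."""
    def __init__(self, in_channels=9, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv1d(in_channels, hidden, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # Classification head: pooled features -> single pathogenicity logit
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(hidden, 1)
        )
        # Importance head: one score per sequence position
        self.importance = nn.Conv1d(hidden, 1, kernel_size=1)

    def forward(self, x):
        h = self.trunk(x)                                   # (batch, hidden, seq_len)
        logit = self.classifier(h).squeeze(-1)              # (batch,)
        imp = torch.sigmoid(self.importance(h)).squeeze(1)  # (batch, seq_len)
        return logit, imp

model = DualHeadCNN()
logit, imp = model(torch.randn(2, 9, 99))
```

Both heads share the trunk, so the importance scores are computed from the same features that drive the classification.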
Performance
Dataset: 16,000 mutations (8,000 pathogenic, 8,000 benign)
Test metrics:
Accuracy: 57.5%
AUC: 0.6166
F1 score: 0.5896
Explainability:
Mutation importance score: 0.83 average
Contrastive gap: 0.409
Explainability prioritized over raw accuracy.
Explainability Mechanism
The importance head directly predicts how important each sequence position is to the prediction.
A contrastive loss enforces:
importance(mutation position) > importance(context positions)
This pushes the model to ground its explanation in the mutated base itself rather than in arbitrary surrounding context.
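A hedged sketch of one way to implement such a contrastive importance loss (the margin value and the mean-over-context formulation are assumptions, not the model's published loss):

```python
import torch
import torch.nn.functional as F

def contrastive_importance_loss(importance, mut_pos, margin=0.3):
    """Hinge loss pushing importance at the mutation position above the
    mean importance of all other (context) positions by at least `margin`.

    importance: (batch, seq_len) scores from the importance head
    mut_pos:    (batch,) index of the mutated position in each sequence
    """
    batch = torch.arange(importance.size(0))
    mut_score = importance[batch, mut_pos]                  # (batch,)
    mask = torch.ones_like(importance, dtype=torch.bool)
    mask[batch, mut_pos] = False                            # drop the mutation position
    context_score = importance[mask].view(importance.size(0), -1).mean(dim=1)
    return F.relu(margin - (mut_score - context_score)).mean()
```

The loss is zero once the mutation position outscores the context by the margin, which is what the reported contrastive gap measures.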
Usage
from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="nileshhanotia/mutation-pathogenicity-predictor",
    filename="pytorch_model.pth"
)

# Load on CPU; move to GPU afterwards if one is available
checkpoint = torch.load(model_path, map_location="cpu")
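The card does not show what the checkpoint contains (a raw state_dict or a wrapper dict with metadata). A self-contained sketch of inspecting a .pth file — using a tiny locally saved stand-in model so it runs without network access — might be:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for the downloaded file: in practice `model_path` comes from
# hf_hub_download as shown above.
tiny = nn.Linear(4, 2)
model_path = os.path.join(tempfile.mkdtemp(), "pytorch_model.pth")
torch.save(tiny.state_dict(), model_path)

checkpoint = torch.load(model_path, map_location="cpu")

# A .pth file may hold a raw state_dict or a dict wrapping one plus metadata;
# check before calling model.load_state_dict().
if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
    state_dict = checkpoint["state_dict"]
else:
    state_dict = checkpoint
print(sorted(state_dict))  # parameter names reveal the layer layout
```

Listing the parameter names is usually enough to reconstruct the module hierarchy needed to instantiate a matching model.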
⚠️ Research Use Only
This model is a research prototype and is NOT intended for clinical or diagnostic use.
Predictions should not be used for medical decision making.