metadata
language: en
license: apache-2.0
library_name: pytorch
pipeline_tag: text-classification
tags:
  - genomics
  - mutation
  - pathogenicity
  - splice
  - explainable-ai
  - biology
  - clinical-ai

🧬 MutationPredictorCNN_v2 — Splice-Aware Pathogenicity Predictor

Model Summary

MutationPredictorCNN_v2 is a convolutional neural network that predicts the pathogenicity of single nucleotide variants (SNVs) from genomic sequence context and splice-related features.

The model includes built-in explainability:

• CNN activation heatmap
• Gradient attribution
• Counterfactual mutation analysis
• Feature ablation analysis
• Splice distance analysis

Validation accuracy: 74.8%


Intended Use

Research use cases:

• Genomic variant interpretation
• Explainable AI research
• Variant prioritization
• Educational and academic research

NOT intended for clinical diagnostic use.


Model Architecture

CNN-based architecture:

Input: 1106 features
Output: Pathogenicity probability
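
As a rough illustration of how a pathogenicity probability can be turned into a label, the sketch below applies a logistic sigmoid to a raw score and thresholds the result. The card does not say whether the network emits a logit or a probability directly, and the 0.5 threshold and the function names `to_probability` and `classify` are assumptions made for this example only.

```python
import math


def to_probability(logit: float) -> float:
    """Map a raw score to (0, 1) with a numerically stable logistic sigmoid."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def classify(probability: float, threshold: float = 0.5) -> str:
    """Label a variant from its probability (threshold is an assumed default)."""
    return "pathogenic" if probability >= threshold else "benign"


# Hypothetical raw score, not an output of the real model.
label = classify(to_probability(1.2))
```

On a balanced dataset a 0.5 threshold is a common starting point, but a different cut-off may suit variant prioritization better.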

Explainability heads:

• Mutation importance
• Region importance
• Splice importance


Training Data

Source: ClinVar

Dataset size: 100,000 variants

• 50,000 pathogenic
• 50,000 benign

Sequence window: 99 bp
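
A minimal sketch of how a 99 bp window might be cut around a variant and one-hot encoded. It assumes the variant sits at the center of the window (49 bases on each side) and that out-of-range positions are padded with `N`; neither detail is stated in this card. The helper names are hypothetical. Note that 99 × 4 = 396 values, so this encoding alone does not account for the model's 1106 input features, whose composition the card does not describe.

```python
from typing import List

WINDOW = 99          # window length given in the model card
FLANK = WINDOW // 2  # 49 bases per side, assuming a centered variant

ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def extract_window(sequence: str, variant_index: int) -> str:
    """Return the WINDOW-bp slice centered on a 0-based variant index.

    Positions that fall outside the sequence are padded with 'N'.
    """
    start = variant_index - FLANK
    end = variant_index + FLANK + 1
    left_pad = "N" * max(0, -start)
    right_pad = "N" * max(0, end - len(sequence))
    core = sequence[max(0, start):min(len(sequence), end)].upper()
    return left_pad + core + right_pad


def one_hot(window: str) -> List[List[int]]:
    """Encode each base as a 4-vector; unknown bases become all zeros."""
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in window]


# Toy sequence for illustration only.
toy_sequence = "ACGT" * 50
window = extract_window(toy_sequence, 100)
encoded = one_hot(window)
```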


Performance

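Validation accuracy: 74.8%

The dataset is balanced between pathogenic and benign variants, so always predicting one class would score about 50%.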


Explainability

Provides multi-level explainability:

• Activation heatmap
• Mutation rank percentile
• Gradient attribution map
• Counterfactual analysis
• Feature ablation analysis
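
Three of the analyses listed above can be sketched in a model-agnostic way. This is not the model's own implementation: `score_fn` stands in for any function that returns a pathogenicity score, the toy scorers are hypothetical, the feature group boundaries are invented for illustration, and the percentile definition (share of other positions with lower importance) is one reasonable choice among several.

```python
from typing import Callable, Dict, Sequence, Tuple

BASES = "ACGT"


def counterfactual_deltas(
    window: str, position: int, score_fn: Callable[[str], float]
) -> Dict[str, float]:
    """Score change when each alternative base is substituted at `position`."""
    baseline = score_fn(window)
    deltas = {}
    for base in BASES:
        if base == window[position]:
            continue  # skip the base already present
        mutated = window[:position] + base + window[position + 1:]
        deltas[base] = score_fn(mutated) - baseline
    return deltas


def rank_percentile(importance: Sequence[float], position: int) -> float:
    """Percentage of other positions with lower importance than `position`."""
    others = [v for i, v in enumerate(importance) if i != position]
    if not others:
        return 100.0
    below = sum(1 for v in others if v < importance[position])
    return 100.0 * below / len(others)


def ablation_deltas(
    features: Sequence[float],
    groups: Dict[str, Tuple[int, int]],
    score_fn: Callable[[Sequence[float]], float],
) -> Dict[str, float]:
    """Score drop when each feature group (start, stop) is zeroed out."""
    baseline = score_fn(features)
    deltas = {}
    for name, (start, stop) in groups.items():
        ablated = list(features)
        ablated[start:stop] = [0.0] * (stop - start)
        deltas[name] = baseline - score_fn(ablated)
    return deltas


# Hypothetical stand-ins for the real model, for illustration only.
def toy_sequence_score(seq: str) -> float:
    return sum(1 for b in seq if b in "GC") / len(seq)


def toy_feature_score(features: Sequence[float]) -> float:
    return sum(features) / 10.0


cf = counterfactual_deltas("AACGTTAACG", 4, toy_sequence_score)
pct = rank_percentile([0.1, 0.4, 0.9, 0.2, 0.3], 3)
abl = ablation_deltas(
    [1.0, 2.0, 3.0, 4.0],
    {"sequence": (0, 2), "splice": (2, 4)},  # invented group boundaries
    toy_feature_score,
)
```

With a real model, `score_fn` would wrap feature extraction and a forward pass, and the groups would follow the actual layout of the 1106 input features.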


Limitations

Supports only:

• Single nucleotide variants
• 99 bp context window

Does not include:

• Conservation scores
• Protein structure
• Expression context


Disclaimer

⚠ Research use only
Not a clinical diagnostic tool


Maintainer

Nilesh Hanotia