VParka committed on
Commit 03b1034 · verified
1 Parent(s): 428d0e6

Upload README.md via DNA Console

Files changed (1)
  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ ---
+ tags:
+ - biology
+ - genomics
+ - classification
+ - sklearn
+ library_name: sklearn
+ ---
+
+ # BioGuard DNA Classifier Ensemble
+
+ This repository contains a dual-model ensemble for DNA sequence analysis and virus classification, trained using the **DNA Governance Console**.
+
+ ## 🧬 Models Included
+
+ This repository hosts two distinct models, each specialized for a different aspect of genomic analysis:
+
+ ### 1. **GenetiForest** (RandomForestClassifier)
+ * **File**: `dna_classifier.joblib`
+ * **Purpose**: General-purpose classification of synthetic vs. biological sequences.
+ * **Architecture**: Random Forest (sklearn) with biological feature extraction (k-mers, GC content, etc.).
+ * **Performance (Test Set)**:
+   * **Accuracy**: 89.4%
+   * **F1 Score**: 89.4%
+
+ ### 2. **ViralBoost** (GradientBoostingClassifier)
+ * **File**: `sequence_model.joblib`
+ * **Purpose**: Identification of specific virus types (Influenza A, Norovirus, etc.) from sequence signatures.
+ * **Architecture**: Gradient Boosting (sklearn) trained on real-world viral sequences.
+ * **Performance (Test Set)**:
+   * **Accuracy**: 99.4%
+   * **F1 Score**: 99.4%
+ * **Classes**: Other, Influenza A, Chicken anemia virus, Norovirus, Influenza B
+
+ ## 🚀 Usage
+
+ You can load these models using `joblib` in Python:
+
+ ```python
+ import joblib
+
+ # Load GenetiForest (synthetic vs. biological classifier)
+ rf_model = joblib.load("dna_classifier.joblib")
+
+ # Load ViralBoost (virus type classifier)
+ gb_model = joblib.load("sequence_model.joblib")
+
+ # Prediction requires the matching FeatureExtractor
+ # (see 'sequence_extractor.joblib')
+ ```
+
+ ## 📊 Training Metadata
+ * **Generated By**: DNA Governance Console (vparka)
+ * **Framework**: scikit-learn
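
The README names k-mers and GC content as GenetiForest features and notes that prediction needs a matching extractor, but it does not show what those features look like. The sketch below is only an illustration of how such features are commonly computed. The function names (`gc_content`, `kmer_frequencies`, `extract_features`), the choice of `k = 3`, and the feature ordering are all assumptions made for this example; they are not the API or the feature layout of the `sequence_extractor.joblib` object, so its output should not be passed to the published models.

```python
from collections import Counter
from itertools import product


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_frequencies(seq: str, k: int = 3) -> list[float]:
    """Relative frequency of every A/C/G/T k-mer, in lexicographic order."""
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing ambiguous bases (e.g. N) are not in vocab and are ignored.
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


def extract_features(seq: str, k: int = 3) -> list[float]:
    """Hypothetical feature vector: GC content followed by 4**k k-mer frequencies."""
    return [gc_content(seq)] + kmer_frequencies(seq, k)


if __name__ == "__main__":
    features = extract_features("ATGCGCATTA", k=2)
    print(len(features))  # 1 GC value + 16 dinucleotide frequencies
```

A fixed, lexicographically ordered k-mer vocabulary keeps the feature vector the same length for every sequence, which is what a tree-based scikit-learn classifier expects as input.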