Upload README.md via DNA Console (Portable Version)
README.md
CHANGED
@@ -7,12 +7,12 @@ tags:
 library_name: sklearn
 ---
 
-# BioGuard DNA Classifier Ensemble (Portable v1.
+# BioGuard DNA Classifier Ensemble (Portable v1.3)
 
 This repository contains a dual-model ensemble for DNA sequence analysis and virus classification, optimized for **portability and zero-dependency loading**.
 
 > [!NOTE]
-> **Version 1.
+> **Version 1.3 Update**: This version has been fully refactored to remove all custom class dependencies from `.joblib` files. Feature extraction is now strictly handled via source code.
 
 ## 🧬 Models Included
 
@@ -23,16 +23,16 @@ This repository hosts two distinct models specialized for different aspects of g
 * **Purpose**: General-purpose synthetic vs. biological sequence classification.
 * **Architecture**: Random Forest (sklearn) with biological feature extraction (k-mers, GC content, etc.).
 * **Performance (Test Set)**:
-* **Accuracy**:
-* **F1 Score**: 89.
+* **Accuracy**: 88.9%
+* **F1 Score**: 89.2%
 
 ### 2. **ViralBoost** (GradientBoostingClassifier)
 * **File**: `sequence_model.joblib`
 * **Purpose**: Specific virus type identification (Influenza A, Norovirus, etc.) based on sequence signatures.
 * **Architecture**: Gradient Boosting (sklearn) trained on real-world viral sequences.
 * **Performance (Test Set)**:
-* **Accuracy**: 99.
-* **F1 Score**: 99.
+* **Accuracy**: 99.3%
+* **F1 Score**: 99.3%
 * **Classes**: Other, Influenza A, Chicken anemia virus, Norovirus, Influenza B
 
 ## 🚀 Usage
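
The v1.3 note in this change says feature extraction now lives in source code instead of inside the `.joblib` files, and the README names k-mers and GC content as features. The repository's actual extractor is not shown in this diff, so the following is only a minimal sketch of what a standalone extractor of that kind might look like. The function names (`gc_content`, `kmer_frequencies`, `extract_features`), the default `k = 3`, and the feature ordering are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G/C among the A/C/G/T bases; 0.0 if there are none."""
    seq = seq.upper()
    valid = sum(seq.count(base) for base in "ACGT")
    if valid == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / valid


def kmer_frequencies(seq: str, k: int = 3) -> List[float]:
    """Normalized counts of every A/C/G/T k-mer, in lexicographic order.

    k-mers containing other symbols (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


def extract_features(seq: str, k: int = 3) -> List[float]:
    """Hypothetical feature vector: GC content followed by k-mer frequencies."""
    return [gc_content(seq)] + kmer_frequencies(seq, k)
```

Because the extractor is plain functions rather than a custom class, nothing from it needs to be pickled alongside the estimator. A vector built this way would only be valid input for the shipped models if it matches the feature layout they were trained on, which this diff does not specify.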