Update README.md
README.md
CHANGED
@@ -24,12 +24,6 @@ A fast, lightweight classifier that categorizes web article extraction outcomes
 
 This model predicts whether HTML extraction succeeded, failed, or returned a non-article page. It combines rule-based heuristics for speed with XGBoost for accuracy on ambiguous cases.
 
-**Key Features:**
-- Processes only first 64KB of HTML for speed
-- 99.99% accuracy on test set
-- Rule-based fast path handles 80%+ of cases instantly
-- Only 26 hand-crafted features (no large embeddings)
-
 ## Classes
 
 | Class | Description |