Allanatrix committed on
Commit a05700e · verified · 1 Parent(s): 391ec69

Update README.md

Files changed (1)
  1. README.md +12 -14
README.md CHANGED
@@ -18,7 +18,7 @@ library_name: xgboost
 
  # Article Extraction Outcome Classifier
 
- A fast, lightweight classifier that categorizes web article extraction outcomes with 99.99% accuracy.
+ A fast, lightweight classifier that categorizes web article extraction outcomes with 90% accuracy
 
  ## Model Description
 
@@ -36,21 +36,19 @@ This model predicts whether HTML extraction succeeded, failed, or returned a non
 
  ## Performance
 
- **Test Set Results (13,852 samples):**
+ ~90% accuracy on a large, real-world test set, with strong performance on dominant classes
 
- ```
- Overall Accuracy: 99.99%
- Macro F1: 0.7976
-
-                            precision  recall  f1-score  support
- full_article_extracted        0.9985  1.0000    0.9992     1312
- partial_article_extracted     1.0000  0.9783    0.9890       92
- api_provider_error            1.0000  1.0000    1.0000      627
- other_failure                 0.0000  0.0000    0.0000        0
- full_page_not_article         1.0000  1.0000    1.0000    11821
- ```
+ | Class | Precision | Recall | F1-score | Support |
+ | ------------------------- | --------- | ------ | -------- | ------- |
+ | full_article_extracted | 0.91 | 0.84 | 0.87 | 1,312 |
+ | partial_article_extracted | 0.76 | 0.63 | 0.69 | 92 |
+ | api_provider_error | 0.95 | 0.93 | 0.94 | 627 |
+ | other_failure | 0.41 | 0.28 | 0.33 | 44 |
+ | full_page_not_article | 0.92 | 0.97 | 0.94 | 11,821 |
+ | **Accuracy** | — | — | **0.90** | 13,852 |
+ | **Macro Avg** | 0.79 | 0.73 | 0.72 | 13,852 |
+ | **Weighted Avg** | 0.90 | 0.90 | 0.90 | 13,852 |
 
- ## Usage
 
  ```python
  import numpy as np