Update model card with evaluation results from the ner_disorderfinding_de_goldset goldset
README.md
CHANGED
@@ -68,3 +68,38 @@ The following hyperparameters were used during training:
 - Pytorch 2.6.0+cu124
 - Datasets 2.16.0
 - Tokenizers 0.20.3
+
+
+## Model Performance
+
+This model has been evaluated on the **goldset from ner_disorderfinding_de_goldset** using
+IO evaluation (sklearn, token level, lenient) with the following results:
+
+### Overall Performance
+
+| Metric | Score |
+|--------|-------|
+| Precision (Macro) | 0.425082 |
+| Recall (Macro) | 0.467785 |
+| F1-Score (Macro) | 0.435900 |
+| Precision (Weighted) | 0.600185 |
+| Recall (Weighted) | 0.698514 |
+| F1-Score (Weighted) | 0.640943 |
+
+**Inference Performance**: 5.51 seconds for evaluation dataset
+
+### Entity-Level Performance (IO Evaluation)
+
+| Entity Type | Precision | Recall | F1-Score | Support |
+|-------------|-----------|--------|----------|---------|
+| DISORDER_FINDING | 0.753771 | 0.900890 | 0.820790 | N/A |
+
+### Evaluation Details
+
+- **Dataset**: goldset from ner_disorderfinding_de_goldset
+- **Dataset Source**: goldset
+- **Evaluation Date**: 2025-09-25 09:43:46
+- **Language**: de
+- **Entities**: DISORDER_FINDING
+
+*This evaluation section is automatically generated and updated.*
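
The model card reports token-level "IO" scores with both macro and weighted averages, but does not show how they are computed. The following is a minimal sketch of one plausible reading, not the evaluation script actually used. It assumes that "lenient" means BIO prefixes are collapsed so that `B-` and `I-` tags of the same entity type count as the same label, and that the `O` class is included in the averages; neither is confirmed by the card. The toy tag sequences and the helper names `to_io`, `per_label_scores`, and `average` are hypothetical. The averaging follows the usual macro and support-weighted definitions, written in plain Python so the example has no dependencies.

```python
from collections import Counter


def to_io(tag):
    """Collapse a BIO tag to IO: 'B-X' and 'I-X' both become 'X'; 'O' stays 'O'."""
    return tag if tag == "O" else tag.split("-", 1)[-1]


def per_label_scores(gold, pred):
    """Token-level precision, recall, F1 and support for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was not the gold label
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        precision = tp[label] / n_pred if n_pred else 0.0
        recall = tp[label] / n_gold if n_gold else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": n_gold,
        }
    return scores


def average(scores, key, weighted=False):
    """Macro average (unweighted mean) or support-weighted mean of one metric."""
    if weighted:
        total = sum(s["support"] for s in scores.values())
        return sum(s[key] * s["support"] for s in scores.values()) / total
    return sum(s[key] for s in scores.values()) / len(scores)


# Hypothetical toy sequences (assumption: not taken from the goldset).
gold_bio = ["O", "O", "O", "O", "B-DISORDER_FINDING", "I-DISORDER_FINDING"]
pred_bio = ["O", "O", "O", "B-DISORDER_FINDING", "I-DISORDER_FINDING", "O"]

gold_io = [to_io(t) for t in gold_bio]
pred_io = [to_io(t) for t in pred_bio]

scores = per_label_scores(gold_io, pred_io)
macro_f1 = average(scores, "f1")
weighted_f1 = average(scores, "f1", weighted=True)

print(scores["DISORDER_FINDING"])
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```

In this toy case the fifth token is tagged `B-` in the gold data and `I-` in the prediction, yet it still counts as a match after the IO collapse. Because `O` has twice the support of `DISORDER_FINDING` here, the weighted F1 is pulled toward the `O` score while the macro F1 treats both labels equally, which is the same kind of gap visible between the macro and weighted rows in the table.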