Add evaluation metrics to model card (exact match 0.525, avg tag F1 0.886)
README.md CHANGED

```diff
@@ -13,6 +13,22 @@ tags:
 datasets:
 - LoveJesus/biblical-tutor-dataset-chirho
 pipeline_tag: text2text-generation
+model-index:
+- name: biblical-parser-chirho
+  results:
+  - task:
+      type: text2text-generation
+      name: Morphological Parsing
+    dataset:
+      type: LoveJesus/biblical-tutor-dataset-chirho
+      name: Biblical Tutor Dataset (Chirho)
+    metrics:
+    - type: exact_match
+      value: 0.525
+      name: Exact Match
+    - type: f1
+      value: 0.886
+      name: Average Tag F1
 ---
 
 # Biblical Morphological Parser (mT5-small)
@@ -85,6 +101,43 @@ class:{pos} | stem:{stem} | lemma:{lemma} | morph:{code} | person:{p} | gender:{
 - Handles individual words, not full syntactic analysis
 - Performance may vary on words not well-represented in training data
 
+## Evaluation Results
+
+Evaluated on a held-out test set of ~20K word-level parsing examples.
+
+### Overall Metrics
+
+| Metric | Score |
+|--------|-------|
+| **Exact Match** (all tags correct) | **0.525** |
+| **Average Tag F1** (across all tags) | **0.886** |
+
+### Per-Tag F1
+
+| Tag | F1 |
+|-----|-----|
+| class (POS) | 0.963 |
+| number | 0.966 |
+| POS | 0.958 |
+| lemma | 0.935 |
+| person | 0.933 |
+| gender | 0.928 |
+| type | 0.900 |
+| morph | 0.890 |
+| state | 0.878 |
+| stem | 0.859 |
+| gloss | 0.539 |
+
+### Per-Language Exact Match
+
+| Language | Exact Match |
+|----------|-------------|
+| Hebrew | 0.514 |
+| Greek | 0.559 |
+
+> The `gloss` tag (English translation) is the hardest to predict exactly, pulling down the overall exact match rate. The model achieves strong F1 on structural/morphological tags (class, number, POS, person, gender all > 0.92).
+
 ---
 
 Built with love for Jesus. Published by [LoveJesus](https://huggingface.co/LoveJesus).
```
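The commit reports exact match (all tags correct) and an average of per-tag F1 scores, but the evaluation script itself is not part of this change. A minimal sketch of one plausible scoring scheme, assuming predictions and references use the pipe-delimited `tag:value` output format shown in the card (the `parse_tags` and `evaluate` helpers are hypothetical names, not from the repository):

```python
from collections import defaultdict

def parse_tags(output: str) -> dict:
    """Split 'class:noun | stem:qal | ...' into a {tag: value} dict."""
    tags = {}
    for field in output.split("|"):
        field = field.strip()
        if ":" in field:
            key, value = field.split(":", 1)
            tags[key.strip()] = value.strip()
    return tags

def evaluate(predictions, references):
    """Return (exact match, average tag F1, per-tag F1).

    Exact match counts a prediction only if every tag matches the
    reference; per-tag F1 treats each tag slot as a retrieval problem
    (correct value = true positive, wrong/spurious = false positive,
    wrong/missing = false negative).
    """
    exact = 0
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for pred, ref in zip(predictions, references):
        p, r = parse_tags(pred), parse_tags(ref)
        if p == r:
            exact += 1
        for tag in set(p) | set(r):
            if tag in r and p.get(tag) == r.get(tag):
                tp[tag] += 1
            else:
                if tag in p:
                    fp[tag] += 1
                if tag in r:
                    fn[tag] += 1
    per_tag_f1 = {
        tag: 2 * tp[tag] / (2 * tp[tag] + fp[tag] + fn[tag])
        for tag in set(tp) | set(fp) | set(fn)
    }
    avg_f1 = sum(per_tag_f1.values()) / len(per_tag_f1) if per_tag_f1 else 0.0
    return exact / len(references), avg_f1, per_tag_f1
```

Under this scheme a single wrong tag (such as the hard-to-predict `gloss`) fails the whole example for exact match while only dinging one slot of the F1 average, which is consistent with the gap between 0.525 exact match and 0.886 average tag F1 reported above.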