rrrr66254 committed
Commit 1827d56 · verified · 1 Parent(s): 47f5f99

Update README.md

Files changed (1)
  1. README.md +17 -6
README.md CHANGED
@@ -4,6 +4,8 @@ language:
 - en
 metrics:
 - bertscore
+- bleu
+- rouge
 base_model:
 - facebook/bart-base
 ---
@@ -120,16 +122,25 @@ This model does not explicitly disaggregate results by demographic group, signer
 
 #### Metrics
 
-- **Primary metric**: BERTScore (F1)
+- **Primary metric**: BERTScore (F1), BLEU, and ROUGE
 - **Model selection**: Best checkpoint based on highest validation BERTScore-F1
-- BERTScore is preferred for this task due to its alignment with semantic quality over token-level exactness (e.g., BLEU or ROUGE)
+- BERTScore is used to evaluate semantic alignment, while BLEU and ROUGE provide additional insight into surface-level n-gram overlap. All metrics were evaluated using the same held-out set of 500 gloss-reference pairs.
 
 ### Results
 
-After 2 epochs of training, the model achieved:
+After 2 epochs of training, the model achieved the following on the 500-pair evaluation set:
 
-- **BERTScore-F1**: 0.83 on held-out evaluation set of 500 gloss-reference pairs
-- Qualitative inspection confirms that most outputs are fluent and contextually aligned, though some suffer from missing function words or incorrect verb tenses.
+- **BERTScore-F1**: 0.83
+- **BLEU Scores**:
+  - BLEU-1: 0.7063
+  - BLEU-2: 0.6175
+  - BLEU-3: 0.5479
+  - BLEU-4: 0.4821
+- **ROUGE Scores**:
+  - ROUGE-1: 0.7587
+  - ROUGE-2: 0.5874
+  - ROUGE-L: 0.7312
+- Qualitative inspection shows that most model outputs are fluent and contextually accurate. Common errors include omission of function words and minor verb tense mismatches.
 
 
 #### Summary
@@ -142,4 +153,4 @@ This model demonstrates strong potential for gloss-to-English translation, with
 
 ## Model Card Contact
 
-- rrrr66254@gmail.com
+- rrrr66254@gmail.com
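The updated card reports BERTScore-F1 alongside BLEU-1 through BLEU-4 and ROUGE-1/2/L, all computed on the same held-out set of 500 gloss-reference pairs. The commit does not include the evaluation code, so the snippet below is only a minimal sketch of how such scores could be reproduced with the Hugging Face `evaluate` library; the placeholder `predictions`/`references` lists and the use of cumulative BLEU (max n-gram order 1 to 4) are assumptions, not details taken from the model card.

```python
# Minimal sketch (assumed, not the author's evaluation script): scoring
# gloss-to-English outputs with BERTScore, BLEU, and ROUGE via the
# Hugging Face `evaluate` library.
import evaluate

# Placeholders standing in for the 500 gloss-reference pairs of the held-out set.
predictions = ["the weather is nice today"]       # hypothetical model outputs
references = ["the weather is very nice today"]   # hypothetical gold English sentences

# BERTScore (primary metric): average the per-sentence F1 values.
bertscore = evaluate.load("bertscore")
bs = bertscore.compute(predictions=predictions, references=references, lang="en")
bertscore_f1 = sum(bs["f1"]) / len(bs["f1"])

# BLEU-1..4, taken here as cumulative BLEU with max n-gram order 1..4
# (the card does not say which BLEU convention was used).
bleu = evaluate.load("bleu")
bleu_scores = {
    f"BLEU-{n}": bleu.compute(
        predictions=predictions,
        references=[[r] for r in references],  # one reference per prediction
        max_order=n,
    )["bleu"]
    for n in range(1, 5)
}

# ROUGE-1, ROUGE-2, ROUGE-L (F-measure, the library default).
rouge = evaluate.load("rouge")
rouge_scores = rouge.compute(predictions=predictions, references=references)

print({
    "BERTScore-F1": round(bertscore_f1, 4),
    **{k: round(v, 4) for k, v in bleu_scores.items()},
    "ROUGE-1": round(rouge_scores["rouge1"], 4),
    "ROUGE-2": round(rouge_scores["rouge2"], 4),
    "ROUGE-L": round(rouge_scores["rougeL"], 4),
})
```

Note that BERTScore is averaged over per-sentence F1 values here; whether the reported BLEU-n figures are cumulative scores or individual n-gram precisions is not stated in the card, so the cumulative convention above is only one plausible reading.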