CocoEntertainment
/

CoALa-1-Pretuned

wikipedia-trained

Eval Results (legacy)

Model card Files Files and versions

CoCoGames commited on Jan 5

Commit

745c475

·

verified ·

1 Parent(s): 9158d89

Update README.md

Files changed (1) hide show

README.md +5 -1

README.md CHANGED Viewed

@@ -70,13 +70,17 @@ CoALa-1 is a **Base Model (Pretrained)**. It has been trained to predict the nex
 ## Evaluation Results
-CoALa-1 was evaluated using the `lm-evaluation-harness`.
 | Benchmark | Metric | CoALa-1 (183M) | GPT-2 (124M) | OPT-125M |
 |---|---|---|---|---|
 | **ARC-Easy** | acc_norm | **28.87%** | 27.00% | 24.50% |
 | **HellaSwag** | acc_norm | **26.96%** | 28.50% | 26.00% |
 ## Technical Specifications
 * **Hidden Size:** 768

 ## Evaluation Results
+CoALa-1 was evaluated using the `lm-evaluation-harness`. It shows a strong performance in factual knowledge compared to other models in its weight class.
 | Benchmark | Metric | CoALa-1 (183M) | GPT-2 (124M) | OPT-125M |
 |---|---|---|---|---|
 | **ARC-Easy** | acc_norm | **28.87%** | 27.00% | 24.50% |
 | **HellaSwag** | acc_norm | **26.96%** | 28.50% | 26.00% |
+![Benchmark Comparison](benchmarks.png)
+> **Figure 1:** Comparison of ARC-Easy (Knowledge) and HellaSwag (Reasoning) scores. CoALa-1 leads in factual knowledge retrieval among sub-200M parameter models.
 ## Technical Specifications
 * **Hidden Size:** 768