Update README.md
README.md CHANGED

@@ -5,7 +5,13 @@ tags: []
 
 # Model Card for Model ID
 
-
+This is a model I accidentally trained with too low a batch size, causing the training loss to spike and essentially fail.
+I found it amusing that it nevertheless does very well on EWoK, Entity Tracking, Adjective Nominalization, COMPS, and AoA.
+Maybe this says something about ourselves, how so many in society fail upwards... food for thought.
+
+
+### UPDATE
+Thanks to the work of my student [Serdar Gülbahar](https://github.com/serdardoesml), the reason for this model scoring well has been traced to a few bugs in the babylm evaluation pipeline.
 The issue is currently being fixed here: https://github.com/babylm/evaluation-pipeline-2025/issues/34
 
 After this is fixed, this model should perform quite poorly, as expected.
@@ -13,7 +19,5 @@ After this is fixed, this model should perform quite poorly, as expected.
 -------------------
 
 
-
-I found it amusing that it nevertheless does very well on EWoK, Entity Tracking, Adjective Nominalization, COMPS, and AoA.
-Maybe this says something about ourselves, how so many in society fail upwards... food for thought.
+
 