ghzhang233 committed on
Commit bf9351b · 1 Parent(s): 3c0fc97

update readme

Files changed (1)
  1. README.md +9 -10
README.md CHANGED
@@ -1,10 +1,9 @@
- ---
- title: README
- emoji: 📉
- colorFrom: gray
- colorTo: yellow
- sdk: static
- pinned: false
- ---
-
- Edit this `README.md` markdown file to author your organization card.
+ # LM-Harmony
+
+ ![train-before-test](assets/banner.png)
+
+ *Which model would you rather have: the weaker student who crammed for the test, or the stronger student who walked in underprepared? Existing leaderboards mostly reward the former.*
+
+ **LM-Harmony** is a multi-task leaderboard for **model potential**. Instead of judging deployment-ready performance out of the box, we use a **train-before-test** paradigm: every model is fine-tuned on the same benchmark-specific training set before evaluation.
+
+ Across 24 diverse tasks, LM-Harmony yields far more stable and consistent rankings than standard direct-evaluation leaderboards. If you care about which model will perform better after you fine-tune it on your own data, the ranking you see here is much more likely to generalize to your workload.