Add evaluation results

#3
by SaylorTwift HF Staff - opened
Files changed (1)
  1. LFM2.5-230M.yaml +35 -0
LFM2.5-230M.yaml ADDED
@@ -0,0 +1,35 @@
+# Model: LiquidAI/LFM2.5-230M
+# Source: https://huggingface.co/LiquidAI/LFM2.5-230M
+# Paper: https://arxiv.org/abs/2511.23404
+# Extracted from model card benchmark tables (see blog post: https://www.liquid.ai/blog/lfm2-5-230m)
+# Date extracted: 2026-06-26
+
+# Missing benchmarks (no registered eval.yaml found on the Hub):
+# - IFEval: 71.71
+# - IFBench: 38.40
+# - Multi-IF: 37.70
+# - CaseReportBench: 22.51
+# - BFCLv3: 43.26
+# - BFCLv4: 21.03
+# - τ²-Bench Telecom: 5.26
+# - τ²-Bench Retail: 13.68
+
+- dataset:
+    id: Idavidrein/gpqa
+    task_id: diamond
+  value: 25.41
+  date: "2026-06-25"
+  source:
+    url: https://huggingface.co/LiquidAI/LFM2.5-230M
+    name: "LFM2.5-230M model card"
+  notes: "GPQA Diamond score from the model card benchmark table"
+
+- dataset:
+    id: TIGER-Lab/MMLU-Pro
+    task_id: mmlu_pro
+  value: 20.25
+  date: "2026-06-25"
+  source:
+    url: https://huggingface.co/LiquidAI/LFM2.5-230M
+    name: "LFM2.5-230M model card"
+  notes: "MMLU-Pro score from the model card benchmark table"
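The sketch below shows one way to sanity-check entries of this shape before merging. It is not part of the PR or of any Hub tooling. The two entries are copied into plain Python dicts so no YAML parser is needed. Several things here are assumptions: that `id` and `task_id` nest under `dataset`, that `url` and `name` nest under `source`, and that scores are percentages between 0 and 100. The helper names `check_entry` and `check_all` are hypothetical.

```python
from datetime import date

# The two results added in this PR, mirrored as plain dicts.
# The nesting shown here is an assumption about the file's structure.
ENTRIES = [
    {
        "dataset": {"id": "Idavidrein/gpqa", "task_id": "diamond"},
        "value": 25.41,
        "date": "2026-06-25",
        "source": {
            "url": "https://huggingface.co/LiquidAI/LFM2.5-230M",
            "name": "LFM2.5-230M model card",
        },
        "notes": "GPQA Diamond score from the model card benchmark table",
    },
    {
        "dataset": {"id": "TIGER-Lab/MMLU-Pro", "task_id": "mmlu_pro"},
        "value": 20.25,
        "date": "2026-06-25",
        "source": {
            "url": "https://huggingface.co/LiquidAI/LFM2.5-230M",
            "name": "LFM2.5-230M model card",
        },
        "notes": "MMLU-Pro score from the model card benchmark table",
    },
]


def check_entry(entry):
    """Return a list of problems found in one result entry (hypothetical helper)."""
    problems = []

    dataset = entry.get("dataset")
    if not isinstance(dataset, dict) or not dataset.get("id"):
        problems.append("dataset.id is missing")

    value = entry.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append("value is not a number")
    elif not 0 <= value <= 100:
        # Assumption: scores are reported as percentages.
        problems.append("value is outside 0-100")

    if "date" in entry:
        try:
            date.fromisoformat(str(entry["date"]))
        except ValueError:
            problems.append("date is not in YYYY-MM-DD form")

    source = entry.get("source")
    if source is not None and not (isinstance(source, dict) and source.get("url")):
        problems.append("source.url is missing")

    return problems


def check_all(entries):
    """Map each entry's position in the list to its list of problems."""
    return {index: check_entry(entry) for index, entry in enumerate(entries)}


if __name__ == "__main__":
    for index, problems in check_all(ENTRIES).items():
        print(index, "ok" if not problems else problems)
```

An empty problem list for every entry means the file passes these checks. It does not confirm that the dataset IDs are registered as benchmarks on the Hub.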