mhjiang0408 committed on
Commit 1218a24 · verified · 1 Parent(s): 62d69b6

Update README.md

Files changed (1)
  1. README.md +13 -6
README.md CHANGED
@@ -46,12 +46,19 @@ LIMI is an agentic model fine‑tuned from [GLM‑4.5](https://huggingface.co/za
  - Training framework: slime
  - Training data: curated conversations from [GAIR/LIMI](https://huggingface.co/datasets/GAIR/LIMI)
 
- ## Key Results
-
- | Model | Agency Bench FTFC | Agency Bench SR | Agency Bench RC | Training Samples |
- |-------|--------|---------|---------|-----------------|
- | LIMI (Ours) | **71.7** | **74.2** |**74.6**| 78 |
- | GLM-4.5 | 37.8 | 50.0 | 47.4 | N/A |
+ ## Performance on AgencyBench
+
+ Our models achieve state-of-the-art performance across multiple agentic evaluation tasks:
+
+ | Model | FTFC (↑) | RC@3 (↑) | SR@3 (↑) | Avg. |
+ |-------|----------|----------|----------|-----------------|
+ | GLM-4.5-Air | 15.0 | 16.1 | 20.0 | 17.0 |
+ | GLM-4.5 | 37.8 | 50.0 | 47.4 | 45.1 |
+ |GLM-4.5-CodeAgent| 48.0 | 48.0|47.5| 47.8|
+ | **LIMI-Air** | **35.4** | **34.3** | **33.1** | **34.3** |
+ | **LIMI** | **71.7** | **74.2** | **74.6** | **73.5** |
+
+ For detailed benchmark results, experimental setup, and comprehensive comparisons, please refer to our [paper](https://arxiv.org/pdf/2509.17567).
 
  ## Model Zoo
 