mhjiang0408 committed
Commit 0b3f0c1 · verified · 1 Parent(s): 871631d

Update README.md

Files changed (1):
  1. README.md +109 -24

README.md CHANGED
@@ -6,40 +6,125 @@ language:
 pipeline_tag: text-generation
 library_name: transformers
 ---
-
- # GLM-4.5-Air
-
 <div align="center">
- <img src=https://raw.githubusercontent.com/zai-org/GLM-4.5/refs/heads/main/resources/logo.svg width="15%"/>
 </div>
- <p align="center">
- 👋 Join our <a href="https://discord.gg/QR7SARHRxK" target="_blank">Discord</a> community.
- <br>
- 📖 Check out the GLM-4.5 <a href="https://z.ai/blog/glm-4.5" target="_blank">technical blog</a>.
- <br>
- 📍 Use GLM-4.5 API services on <a href="https://docs.z.ai/guides/llm/glm-4.5">Z.ai API Platform (Global)</a> or <br> <a href="https://docs.bigmodel.cn/cn/guide/models/text/glm-4.5">Zhipu AI Open Platform (Mainland China)</a>.
- <br>
- 👉 One click to <a href="https://chat.z.ai">GLM-4.5</a>.
- </p>
-
- ## Model Introduction
-
- The **GLM-4.5** series models are foundation models designed for intelligent agents. GLM-4.5 has **355** billion total parameters with **32** billion active parameters, while GLM-4.5-Air adopts a more compact design with **106** billion total parameters and **12** billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.
-
- Both GLM-4.5 and GLM-4.5-Air are hybrid reasoning models that provide two modes: thinking mode for complex reasoning and tool usage, and non-thinking mode for immediate responses.
-
- We have open-sourced the base models, hybrid reasoning models, and FP8 versions of the hybrid reasoning models for both GLM-4.5 and GLM-4.5-Air. They are released under the MIT open-source license and can be used commercially and for secondary development.
-
- As demonstrated in our comprehensive evaluation across 12 industry-standard benchmarks, GLM-4.5 achieves exceptional performance with a score of **63.2**, in the **3rd** place among all the proprietary and open-source models. Notably, GLM-4.5-Air delivers competitive results at **59.8** while maintaining superior efficiency.
-
- ![bench](https://raw.githubusercontent.com/zai-org/GLM-4.5/refs/heads/main/resources/bench.png)
-
- For more eval results, show cases, and technical details, please visit
- our [technical blog](https://z.ai/blog/glm-4.5). The technical report will be released soon.
-
- The model code, tool parser and reasoning parser can be found in the implementation of [transformers](https://github.com/huggingface/transformers/tree/main/src/transformers/models/glm4_moe), [vLLM](https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/glm4_moe_mtp.py) and [SGLang](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/models/glm4_moe.py).

 ## Quick Start

- Please refer our [github page](https://github.com/zai-org/GLM-4.5) for more detail.
 pipeline_tag: text-generation
 library_name: transformers
 ---
 <div align="center">
+
+ <img src="assets/sii.jpg" alt="SII" width="96" height="96">
+ <img src="assets/asi.png" alt="ASI" width="96" height="96">
 </div>
+ ---
+ tags:
+ - text-generation
+ - agent
+ - tool-use
+ - long-context
+ license: other
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
+ # LIMI‑Air: Less is More for Agency
+
+ ## 📌 Table of Contents
+ - [Overview](#overview)
+ - [Model Details](#model-details)
+ - [Model Zoo](#model-zoo)
+ - [Datasets](#datasets)
+ - [Quick Start](#quick-start)
+ - [Prompting](#prompting)
+ - [Evaluation](#evaluation)
+ - [Limitations](#limitations)
+ - [License](#license)
+ - [Citation](#citation)
+
+ ## Overview
+
+ LIMI‑Air is a smaller, faster agentic variant built on [GLM‑4.5‑Air](https://huggingface.co/zai-org/GLM-4.5-Air) (~106B total parameters) and fine‑tuned on the same compact, high‑quality agentic data as LIMI.
+ ## Model Details
+
+ - Base model: `zai-org/GLM-4.5-Air`
+ - Parameters: ~106B total
+ - Training framework: slime
+ - Training data: [GAIR/LIMI](https://huggingface.co/datasets/GAIR/LIMI)
+
+ ## Model Zoo
+
+ Our LIMI models are available on Hugging Face 🤗:
+
+ | Model | Backbone | Size | Link |
+ |---|---|---|---|
+ | LIMI | [GLM‑4.5](https://huggingface.co/zai-org/GLM-4.5) | 355B | https://huggingface.co/GAIR/LIMI |
+ | LIMI‑Air | [GLM‑4.5‑Air](https://huggingface.co/zai-org/GLM-4.5-Air) | 106B | https://huggingface.co/GAIR/LIMI-Air |
+
+ ## Datasets
+
+ We release our dataset through Hugging Face 🤗:
+ - Name: `GAIR/LIMI`
+ - Summary: curated agentic SFT data (OpenAI `messages` format with optional `tools` and normalized tool‑call arguments); the current release contains 78 high‑quality samples.
+ - Link: https://huggingface.co/datasets/GAIR/LIMI
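The record layout the summary describes can be sketched as follows. This is a minimal illustration, not the dataset's documented schema: the tool name, arguments, and message contents are invented, and real samples are much longer agentic trajectories.

```python
import json

# Illustrative LIMI-style record: OpenAI `messages` plus an optional `tools`
# list. All concrete values here are invented for illustration.
sample = {
    "messages": [
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": "Create an empty git repository."},
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "type": "function",
                    "function": {
                        "name": "run_shell",
                        # Tool-call arguments are stored as a JSON string.
                        "arguments": '{"command": "git init"}',
                    },
                }
            ],
        },
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "run_shell",
                "parameters": {
                    "type": "object",
                    "properties": {"command": {"type": "string"}},
                    "required": ["command"],
                },
            },
        }
    ],
}

def normalized_tool_calls(record):
    """Parse each tool call's JSON-string arguments into a Python dict."""
    calls = []
    for msg in record["messages"]:
        for call in msg.get("tool_calls", []):
            fn = call["function"]
            calls.append({"name": fn["name"], "arguments": json.loads(fn["arguments"])})
    return calls

print(normalized_tool_calls(sample))
```

Normalizing the JSON-string `arguments` into dicts is the kind of "normalized tool‑call arguments" step the summary mentions.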

 ## Quick Start

+ <details>
+ <summary>Start with HF Transformers</summary>
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "GAIR/LIMI-Air", torch_dtype="auto", device_map="auto", trust_remote_code=True
+ )
+ tok = AutoTokenizer.from_pretrained("GAIR/LIMI-Air", trust_remote_code=True)
+ messages = [{"role": "user", "content": "List the files in the current directory."}]
+ text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tok(text, return_tensors="pt").to(model.device)
+ out = model.generate(
+     **inputs,
+     max_new_tokens=4096,
+     temperature=0.6,
+     top_p=0.95,
+     do_sample=True,
+ )
+ print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```
+
+ </details>
+
+ <details>
+ <summary>Start with vLLM</summary>
+
+ ```python
+ from vllm import LLM, SamplingParams
+ from transformers import AutoTokenizer
+
+ llm = LLM(model="GAIR/LIMI-Air", trust_remote_code=True)
+ tok = AutoTokenizer.from_pretrained("GAIR/LIMI-Air", trust_remote_code=True)
+ messages = [{"role": "user", "content": "List the files in the current directory."}]
+ text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ out = llm.generate([text], SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096))
+ print(out[0].outputs[0].text)
+ ```
+
+ </details>
+
+ ## Prompting
+
+ Prompting works the same as for LIMI: provide messages in the OpenAI chat format, optionally with a `tools` list, and include a grounding system message when helpful.
+
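A minimal request in that format might look like the sketch below. The system-message wording and the `read_file` tool schema are illustrative assumptions, not part of the model card.

```python
# Hedged sketch of an OpenAI-chat-format request with an optional `tools`
# list and a grounding system message. Tool schema and wording are invented.
def build_request(task: str) -> dict:
    return {
        "model": "GAIR/LIMI-Air",
        "messages": [
            # Grounding system message: state the environment and constraints.
            {
                "role": "system",
                "content": "You are an autonomous coding agent working in a "
                           "workspace. Use the provided tools; do not invent "
                           "file contents.",
            },
            {"role": "user", "content": task},
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "read_file",
                    "description": "Read a UTF-8 text file from the workspace.",
                    "parameters": {
                        "type": "object",
                        "properties": {"path": {"type": "string"}},
                        "required": ["path"],
                    },
                },
            }
        ],
    }

req = build_request("Summarize README.md in one sentence.")
print(req["messages"][0]["role"], len(req["tools"]))
```

The same `messages`/`tools` payload works whether you call the model through an OpenAI-compatible server or feed it to a chat template locally.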
+ ## Evaluation
+
+ Evaluation uses the same metrics (FTFC, SR@R, and RC@R at R=3) and the same protocol as LIMI; see the paper for comparative results.
+
+ ## Limitations
+
+ - Inherits the constraints of the base model; validated only on curated agentic tasks
+ - Lower compute cost, with potential performance trade‑offs on complex tasks
+
+ ## License
+
+ - Inherits the GLM‑4.5‑Air license terms; verify the upstream license before deployment
+
+ ## Citation
+
+ ```bibtex
+ @article{LIMI2025,
+   title   = {Less is More for Agentic Intelligence},
+   author  = {LIMI Authors},
+   year    = {2025},
+   journal = {arXiv preprint arXiv:2502.03387}
+ }
+ ```