---
tags:
- text-generation
- agent
- tool-use
- long-context
license: other
language:
- en
pipeline_tag: text-generation
---
# LIMI: Less is More for Agency

## Overview
LIMI is an agentic model fine-tuned from GLM-4.5 on a compact, high-quality dataset, emphasizing:

- Targeted capabilities: tool use, multi-turn correction, and spec compliance
- Long-context trajectories, with samples filtered by tokenizer to ≤128k tokens
- OpenAI-style `messages` with optional function/tool calls
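The tokenizer-based length filtering described above can be sketched roughly as follows. This is an illustrative sketch, not the actual pipeline: `filter_by_token_budget` is a hypothetical helper, and `encode` stands in for any tokenizer's encode function (the real filtering presumably uses the GLM-4.5 tokenizer and chat template).

```python
import json

def filter_by_token_budget(samples, encode, budget=128_000):
    """Keep only samples whose serialized conversation fits the token budget.

    `encode` is any callable mapping text -> list of tokens/ids.
    """
    kept = []
    for sample in samples:
        # Serialize the full message list so the count covers the whole trajectory.
        text = json.dumps(sample["messages"])
        if len(encode(text)) <= budget:
            kept.append(sample)
    return kept

# Toy demo with a whitespace "tokenizer" and a tiny budget.
toy = [
    {"messages": [{"role": "user", "content": "hi"}]},
    {"messages": [{"role": "user", "content": "a " * 50}]},
]
short = filter_by_token_budget(toy, lambda t: t.split(), budget=20)
```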
## Model Details

- Base model: `zai-org/GLM-4.5`
- Context: up to 128k tokens (training budget)
- Training framework: slime
- Training data: curated conversations from `GAIR/LIMI`
## Key Results
| Model | Agency Bench FTFC | Agency Bench SR | Agency Bench RC | Training Samples |
|---|---|---|---|---|
| LIMI (Ours) | 71.7 | 74.2 | 74.6 | 78 |
| GLM-4.5 | 37.8 | 50.0 | 47.4 | 100k+ |
## Model Zoo

Our LIMI models are available on Hugging Face 🤗:
| Model | Backbone | Size | Link |
|---|---|---|---|
| LIMI | GLM‑4.5 | 355B | https://huggingface.co/GAIR/LIMI |
| LIMI‑Air | GLM‑4.5‑Air | 106B | https://huggingface.co/GAIR/LIMI-Air |
## Datasets

We release our datasets through Hugging Face 🤗:

- Name: `GAIR/LIMI`
- Summary: curated agentic SFT data (OpenAI-style `messages`, optional `tools`, normalized tool-call arguments), filtered by tokenizer to ≤128k tokens; the current release contains 78 high-quality samples.
- Link: https://huggingface.co/datasets/GAIR/LIMI
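As a sketch of what "normalized tool-call arguments" could mean in practice (the exact normalization used for this dataset is not specified here, and `normalize_tool_calls` is a hypothetical helper): OpenAI-style tool calls carry `arguments` as a JSON-encoded string, so a normalizer might parse each one and re-serialize it canonically.

```python
import json

def normalize_tool_calls(message):
    """Re-serialize each tool call's `arguments` as canonical JSON
    (sorted keys, no stray whitespace). Leaves unparseable strings as-is."""
    for call in message.get("tool_calls", []):
        fn = call["function"]
        try:
            args = json.loads(fn["arguments"])
        except (json.JSONDecodeError, TypeError):
            continue  # keep the original string if it is not valid JSON
        fn["arguments"] = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return message

msg = {
    "role": "assistant",
    "tool_calls": [
        {"function": {"name": "read_file",
                      "arguments": '{"path": "equation.py",  "mode": "r"}'}}
    ],
}
normalized = normalize_tool_calls(msg)
```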
## Quick Start

### Start with HF Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMI", torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)
# Decode only the newly generated tokens.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
### Start with vLLM

```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

llm = LLM(model="GAIR/LIMI", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = llm.generate(text, SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096))
print(out[0].outputs[0].text)
```
## Prompting

- Messages follow the OpenAI chat format; include a grounding system message when helpful.
- Example:

```json
[
  {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
  {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]
```
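When exposing tools, the same chat format extends with an OpenAI-style `tools` list, assistant `tool_calls`, and `tool` reply messages. A minimal sketch follows; the `run_python` tool and the conversation are purely illustrative, not part of the released data.

```python
import json

# Hypothetical tool schema in OpenAI function-calling format.
tools = [{
    "type": "function",
    "function": {
        "name": "run_python",
        "description": "Execute a Python snippet and return its stdout.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
}]

# An assistant turn that calls the tool, followed by the tool's reply.
messages = [
    {"role": "user", "content": "What is 2**10?"},
    {"role": "assistant", "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "run_python",
                     "arguments": json.dumps({"code": "print(2**10)"})},
    }]},
    {"role": "tool", "tool_call_id": "call_1", "content": "1024"},
]
```

Note that `arguments` is a JSON-encoded string, not a nested object, matching the OpenAI convention.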
## Evaluation

- We report FTFC (First-Turn Functional Completeness), SR@R (Success Rate at R), and RC@R (Remaining Chances at R), with R=3.
- See the paper for the experimental protocol and full scores.
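As an illustration of how such round-based metrics might be computed (the precise definitions live in the paper; the reading below, where each task records a per-round success flag, is an assumption, and both helper names are hypothetical):

```python
def ftfc(results):
    """Fraction of tasks solved on the first turn.

    `results` maps task id -> list of per-round success booleans.
    """
    return sum(rounds[0] for rounds in results.values()) / len(results)

def sr_at(results, r=3):
    """Fraction of tasks solved within the first r rounds."""
    return sum(any(rounds[:r]) for rounds in results.values()) / len(results)

runs = {
    "task_a": [True, True, True],     # solved immediately
    "task_b": [False, True, True],    # solved on round 2
    "task_c": [False, False, False],  # never solved
}
```

Under this reading, `sr_at(results, 1)` coincides with `ftfc(results)`.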
## Limitations

- May produce incorrect tool arguments or overfit to frequent schemas
- Not safety-filtered for sensitive domains; use with guardrails and oversight

## License

- Inherits the base model's (GLM-4.5) terms; verify the upstream license before deployment
## Citation

```bibtex
@article{LIMI2025,
  title   = {Less is More for Agency},
  author  = {LIMI Authors},
  year    = {2025},
  journal = {arXiv preprint arXiv:2502.03387}
}
```