---
tags:
- text-generation
- agent
- tool-use
- long-context
license: other
language:
- en
pipeline_tag: text-generation
---
# LIMI: Less is More for Agency

## Overview
LIMI is an agentic model fine-tuned from GLM-4.5 on a compact, high-quality dataset, emphasizing:

- Targeted capabilities: tool use, multi-turn correction, and spec compliance
- Long-context trajectories, with samples filtered by tokenizer to ≤128k tokens
- OpenAI-style `messages` with optional function/tool calls
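The tokenizer-based length filtering described above can be sketched roughly as follows. This is an illustrative sketch, not the actual pipeline: `filter_by_token_budget` is a hypothetical helper, and `encode` stands in for any tokenizer's encode function (the real filtering presumably uses the GLM-4.5 tokenizer and chat template).

```python
import json

def filter_by_token_budget(samples, encode, budget=128_000):
    """Keep only samples whose serialized conversation fits the token budget.

    `encode` is any callable mapping text -> list of tokens/ids.
    """
    kept = []
    for sample in samples:
        # Serialize the full message list so the count covers the whole trajectory.
        text = json.dumps(sample["messages"])
        if len(encode(text)) <= budget:
            kept.append(sample)
    return kept

# Toy demo with a whitespace "tokenizer" and a tiny budget.
toy = [
    {"messages": [{"role": "user", "content": "hi"}]},
    {"messages": [{"role": "user", "content": "a " * 50}]},
]
short = filter_by_token_budget(toy, lambda t: t.split(), budget=20)
```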
## Model Details

- Base model: `zai-org/GLM-4.5`
- Context: up to 128k tokens (training budget)
- Training framework: slime
- Training data: curated conversations from `GAIR/LIMI`
## Key Results
| Model | Agency Bench FTFC | Agency Bench SR | Agency Bench RC | Training Samples |
|---|---|---|---|---|
| LIMI (Ours) | 71.7 | 74.2 | 74.6 | 78 |
| GLM-4.5 | 37.8 | 50.0 | 47.4 | 100k+ |
## Model Zoo

Our LIMI models are available on Hugging Face 🤗:
| Model | Backbone | Size | Link |
|---|---|---|---|
| LIMI | GLM‑4.5 | 355B | https://huggingface.co/GAIR/LIMI |
| LIMI‑Air | GLM‑4.5‑Air | 106B | https://huggingface.co/GAIR/LIMI-Air |
## Datasets

We release our datasets through Hugging Face 🤗:

- Name: `GAIR/LIMI`
- Summary: curated agentic SFT data (OpenAI-style `messages`, optional `tools`, normalized tool-call arguments), filtered by tokenizer to ≤128k tokens; the current release contains 78 high-quality samples.
- Link: https://huggingface.co/datasets/GAIR/LIMI
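As a sketch of what "normalized tool-call arguments" could mean in practice (the exact normalization used for this dataset is not specified here, and `normalize_tool_calls` is a hypothetical helper): OpenAI-style tool calls carry `arguments` as a JSON-encoded string, so a normalizer might parse each one and re-serialize it canonically.

```python
import json

def normalize_tool_calls(message):
    """Re-serialize each tool call's `arguments` as canonical JSON
    (sorted keys, no stray whitespace). Leaves unparseable strings as-is."""
    for call in message.get("tool_calls", []):
        fn = call["function"]
        try:
            args = json.loads(fn["arguments"])
        except (json.JSONDecodeError, TypeError):
            continue  # keep the original string if it is not valid JSON
        fn["arguments"] = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return message

msg = {
    "role": "assistant",
    "tool_calls": [
        {"function": {"name": "read_file",
                      "arguments": '{"path": "equation.py",  "mode": "r"}'}}
    ],
}
normalized = normalize_tool_calls(msg)
```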
## Quick Start

### Start with HF Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMI", torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)
# Decode only the newly generated tokens.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
### Start with vLLM

```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

llm = LLM(model="GAIR/LIMI", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = llm.generate(text, SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096))
print(out[0].outputs[0].text)
```
## Prompting

- Messages follow the OpenAI chat format; include a grounding system message when helpful.
- Example:

```json
[
  {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
  {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]
```
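When exposing tools, the same chat format extends with an OpenAI-style `tools` list, assistant `tool_calls`, and `tool` reply messages. A minimal sketch follows; the `run_python` tool and the conversation are purely illustrative, not part of the released data.

```python
import json

# Hypothetical tool schema in OpenAI function-calling format.
tools = [{
    "type": "function",
    "function": {
        "name": "run_python",
        "description": "Execute a Python snippet and return its stdout.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
}]

# An assistant turn that calls the tool, followed by the tool's reply.
messages = [
    {"role": "user", "content": "What is 2**10?"},
    {"role": "assistant", "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "run_python",
                     "arguments": json.dumps({"code": "print(2**10)"})},
    }]},
    {"role": "tool", "tool_call_id": "call_1", "content": "1024"},
]
```

Note that `arguments` is a JSON-encoded string, not a nested object, matching the OpenAI convention.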
## Evaluation

- We report FTFC (First-Turn Functional Completeness), SR@R (Success Rate at R), and RC@R (Remaining Chances at R), with R=3.
- See the paper for the experimental protocol and full scores.
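As an illustration of how such round-based metrics might be computed (the precise definitions live in the paper; the reading below, where each task records a per-round success flag, is an assumption, and both helper names are hypothetical):

```python
def ftfc(results):
    """Fraction of tasks solved on the first turn.

    `results` maps task id -> list of per-round success booleans.
    """
    return sum(rounds[0] for rounds in results.values()) / len(results)

def sr_at(results, r=3):
    """Fraction of tasks solved within the first r rounds."""
    return sum(any(rounds[:r]) for rounds in results.values()) / len(results)

runs = {
    "task_a": [True, True, True],     # solved immediately
    "task_b": [False, True, True],    # solved on round 2
    "task_c": [False, False, False],  # never solved
}
```

Under this reading, `sr_at(results, 1)` coincides with `ftfc(results)`.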
## Limitations

- May produce incorrect tool arguments or overfit to frequent schemas
- Not safety-filtered for sensitive domains; use with guardrails and oversight

## License

- Inherits the base model's (GLM-4.5) terms; verify the upstream license before deployment
## Citation

```bibtex
@article{LIMI2025,
  title   = {Less is More for Agency},
  author  = {LIMI Authors},
  year    = {2025},
  journal = {arXiv preprint arXiv:2502.03387}
}
```