---
tags:
  - text-generation
  - agent
  - tool-use
  - long-context
license: other
language:
  - en
pipeline_tag: text-generation
---

# LIMI: Less is More for Agency

## Overview

LIMI is an agentic model fine‑tuned from GLM‑4.5 on a compact, high‑quality dataset that emphasizes:

- Targeted capabilities: tool use, multi‑turn correction, and specification compliance
- Long‑context trajectories, tokenizer‑filtered to ≤128k tokens
- OpenAI‑style messages with optional function/tool calls
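A minimal sketch of what one such conversation can look like, assuming the standard OpenAI chat schema; the tool name and arguments here are hypothetical illustrations, not taken from the dataset:

```python
import json

# Hypothetical OpenAI-style conversation with one tool call and its result.
# Tool-call arguments are a JSON-encoded string, as in the OpenAI schema.
messages = [
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "user", "content": "What is the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool name
                    "arguments": json.dumps({"city": "Paris"}),
                },
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_0", "content": '{"temp_c": 18}'},
]

print([m["role"] for m in messages])  # → ['system', 'user', 'assistant', 'tool']
```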

## Model Details

- Base model: zai-org/GLM-4.5
- Context: up to 128k tokens (training budget)
- Training framework: slime
- Training data: curated conversations from GAIR/LIMI

## Key Results

| Model       | Agency Bench FTFC | Agency Bench SR | Agency Bench RC | Training Samples |
|-------------|-------------------|-----------------|-----------------|------------------|
| LIMI (Ours) | 71.7              | 74.2            | 74.6            | 78               |
| GLM-4.5     | 37.8              | 50.0            | 47.4            | 100k+            |

## Model Zoo

Our LIMI model is available on Hugging Face 🤗 as `GAIR/LIMI`.

## Datasets

We release our datasets through Hugging Face 🤗:

- Name: GAIR/LIMI
- Summary: curated agentic SFT data (OpenAI messages, optional tools, normalized tool‑call arguments), filtered by tokenizer to ≤128k tokens; the current release contains ~78 high‑quality samples.
- Link: https://huggingface.co/datasets/GAIR/LIMI
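The tokenizer‑based length filter described above can be sketched generically. The 128k budget follows this card; `count_tokens` is a stand‑in for a real tokenizer call such as `len(tok(text)["input_ids"])` with the GLM‑4.5 tokenizer:

```python
MAX_TOKENS = 128_000  # training-budget cap stated in this card

def filter_by_token_budget(samples, count_tokens, max_tokens=MAX_TOKENS):
    """Keep only samples whose token count fits within the budget."""
    return [s for s in samples if count_tokens(s) <= max_tokens]

# Illustrative stand-in counter; a real pipeline would use the model tokenizer.
approx_count = lambda text: len(text.split())

kept = filter_by_token_budget(["short sample", "x " * 200_000], approx_count)
print(len(kept))  # → 1 (the oversized sample is dropped)
```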

## Quick Start

### Start with HF Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMI", torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
]

text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
### Start with vLLM

```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

llm = LLM(model="GAIR/LIMI", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
]

text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = llm.generate(text, SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096))
print(out[0].outputs[0].text)
```

## Prompting

- Messages follow the OpenAI chat format; include a grounding system message when helpful.
- Example:

```json
[
  {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
  {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]
```

## Evaluation

- We report FTFC (First‑Turn Functional Completeness), SR@R (Success Rate at round R), and RC@R (Remaining Chances at round R), with R=3.
- See the paper for the experimental protocol and scores.
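As a rough illustration of one plausible reading of SR@R (the fraction of tasks solved within the first R rounds); the exact protocol is defined in the paper, and this sketch is an assumption, not the official scorer:

```python
def sr_at_r(outcomes, r=3):
    """Fraction of tasks solved within the first r rounds.

    `outcomes` holds one list of booleans per task, one entry per round,
    True when that round's attempt succeeds. Illustrative reading of SR@R;
    see the paper for the exact definition.
    """
    solved = sum(any(rounds[:r]) for rounds in outcomes)
    return solved / len(outcomes)

# Task 1 succeeds in round 2, task 2 never succeeds, task 3 on the first try.
print(sr_at_r([[False, True, False], [False, False, False], [True]]))  # → 0.666...
```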

## Limitations

- May produce incorrect tool arguments or overfit to frequently seen tool schemas
- Not safety‑filtered for sensitive domains; use with guardrails and human oversight

## License

- Inherits the base model's (GLM‑4.5) terms; verify the upstream license before deployment

## Citation

```bibtex
@article{LIMI2025,
  title   = {Less is More for Agency},
  author  = {LIMI Authors},
  year    = {2025},
  journal = {arXiv preprint arXiv:2502.03387}
}
```