AyoubChLin/LFM2.5-1.2B-hermes-agent

Merged instruction model for agentic/function-calling behavior, fine-tuned from LiquidAI/LFM2.5-1.2B-Instruct on Hermes agent reasoning traces.

Model Details

  • Base model: LiquidAI/LFM2.5-1.2B-Instruct
  • Training method: Supervised Fine-Tuning (SFT) with LoRA, then merged into full weights (see the merge sketch after this list)
  • Intended repo: AyoubChLin/LFM2.5-1.2B-hermes-agent
  • Adapter used for merge: AyoubChLin/LFM2.5-1.2B-hermes-agent-lora
  • Frameworks: transformers, trl (SFTTrainer), peft, datasets
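
The merge step loads the base model, attaches the LoRA adapter, and folds the adapter weights into a full checkpoint. A minimal sketch with peft (repository IDs from the list above; the exact notebook code may differ):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "LiquidAI/LFM2.5-1.2B-Instruct"
adapter_id = "AyoubChLin/LFM2.5-1.2B-hermes-agent-lora"

# Load the base model in bf16 and attach the trained LoRA adapter
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, trust_remote_code=True
)
model = PeftModel.from_pretrained(base, adapter_id)

# Fold the adapter into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("LFM2.5-1.2B-hermes-agent")
AutoTokenizer.from_pretrained(base_id, trust_remote_code=True).save_pretrained(
    "LFM2.5-1.2B-hermes-agent"
)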

Training Data

Source dataset:

  • lambda/hermes-agent-reasoning-traces with config kimi

Data preprocessing in the training notebook (a sketch of this pipeline follows the list):

  • Converted ShareGPT turns to OpenAI-style role messages (system, user, assistant, tool)
  • Normalized <tool_call>...</tool_call> blocks into LFM tool-call format:
    • <|tool_call_start|>[...]<|tool_call_end|>
  • Normalized <tool_response>...</tool_response> to raw tool role content
  • Rendered each sample to LFM2.5 ChatML with special tokens:
    • <|startoftext|>, <|im_start|>, <|im_end|>, <|endoftext|>
  • Filtered out samples longer than the maximum sequence length of 16,384 tokens
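
A minimal sketch of the turn normalization, assuming ShareGPT-style turns with "from"/"value" fields and tool calls embedded as single JSON objects inside <tool_call> tags (the role map, field names, and regexes below are illustrative rather than the exact notebook code):

import json
import re

ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant", "tool": "tool"}

def normalize_turn(turn):
    """Map one ShareGPT turn to an OpenAI-style message with LFM tool-call tags."""
    role = ROLE_MAP.get(turn["from"], turn["from"])
    text = turn["value"]

    # <tool_call>{...}</tool_call>  ->  <|tool_call_start|>[{...}]<|tool_call_end|>
    def to_lfm(match):
        return ("<|tool_call_start|>"
                + json.dumps([json.loads(match.group(1))])
                + "<|tool_call_end|>")
    text = re.sub(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", to_lfm, text, flags=re.S)

    # <tool_response>...</tool_response>  ->  raw content carried by the "tool" role
    text = re.sub(r"</?tool_response>", "", text).strip()
    return {"role": role, "content": text}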

Dataset counts from the run:

  • Raw rows: 7,646
  • Kept after preprocessing: 4,987
  • Skipped as too long: 2,659
  • Malformed skipped: 0
  • Split: 4,887 train / 100 validation

Training Procedure

Main hyperparameters (from W&B run a6kutghd):

  • Epochs: 1
  • Max sequence length: 16,384
  • Packing: True
  • Per-device batch size: 4
  • Gradient accumulation: 4
  • Optimizer: adamw_8bit
  • Learning rate: 2e-4
  • LR scheduler: cosine
  • Warmup ratio: 0.03
  • Weight decay: 0.01
  • Max grad norm: 1.0
  • Precision: bf16=True, tf32=True
  • Gradient checkpointing: True (use_reentrant=False)
  • Seed: 42

LoRA setup (see the configuration sketch after this list):

  • r=32, alpha=64, dropout=0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • modules_to_save: embed_tokens, lm_head
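
Expressed as a peft/trl configuration, the run corresponds roughly to the sketch below (argument names follow current trl/peft releases, e.g. max_length vs. the older max_seq_length, and may differ slightly from the notebook):

from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="outputs",
    num_train_epochs=1,
    max_length=16384,
    packing=True,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="adamw_8bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    max_grad_norm=1.0,
    bf16=True,
    tf32=True,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    seed=42,
)

# trainer = SFTTrainer(model=model, args=training_args, train_dataset=train_ds,
#                      eval_dataset=eval_ds, peft_config=lora_config)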

Hardware (logged by W&B metadata):

  • 1x NVIDIA B200
  • CUDA environment in Modal

Training Results

W&B run: cherguelainea/lfm25-hermes-sft/a6kutghd

Final logged metrics:

  • train_loss: 0.4654
  • train/loss (last logged step): 0.3460
  • eval/loss: 0.3581
  • eval/mean_token_accuracy: 0.9141
  • train_runtime: 945.35s (~15.8 min)
  • train_samples_per_second: 3.42
  • train_steps_per_second: 0.215
  • eval/runtime: 6.44s
  • total_flos: 3.5557e17
  • train/global_step: 203

Note: W&B marks the run state as crashed, but training/evaluation metrics and model artifacts were logged and used for merge/push.

Intended Use

This model is intended for:

  • assistant-style chat
  • agent/tool-use workflows
  • function-calling style prompting and multi-turn reasoning traces

Not intended for:

  • safety-critical autonomous decisions
  • legal/medical/financial advice without human oversight

Prompt Format

Use the model's chat template via tokenizer.apply_chat_template whenever possible.

If you format prompts manually, note that training used the LFM2.5 ChatML structure with the special tokens and role blocks listed in the preprocessing section above.
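
A hand-formatted single-turn prompt would look roughly like the snippet below; the exact newline placement and whether <|startoftext|> is emitted are assumptions, so prefer the chat template for the authoritative layout:

# Illustrative hand-built prompt; verify against tokenizer.apply_chat_template.
prompt = (
    "<|startoftext|>"
    "<|im_start|>system\nYou are a helpful agent.<|im_end|>\n"
    "<|im_start|>user\nList Python files in /workspace and count them.<|im_end|>\n"
    "<|im_start|>assistant\n"
)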

How to Use

Inference (Transformers)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "AyoubChLin/LFM2.5-1.2B-hermes-agent"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "List Python files in /workspace and count them."}
]

# Build the prompt with the model's chat template
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    out = model.generate(
        inputs,
        max_new_tokens=512,
        do_sample=True,  # required for temperature/top_k to take effect
        temperature=0.6,
        top_k=50,
        repetition_penalty=1.05,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
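
For tool-use prompting, recent transformers releases let apply_chat_template accept a tools argument (Python callables or JSON schemas). Whether the bundled LFM2.5 chat template renders tools this way should be verified against the tokenizer config, and the list_files function here is a made-up example tool:

def list_files(directory: str):
    """
    List file names in a directory.

    Args:
        directory: Path of the directory to list.
    """
    import os
    return os.listdir(directory)

messages = [{"role": "user", "content": "How many Python files are in /workspace?"}]

# Expose the tool schema to the chat template; the model is expected to emit
# <|tool_call_start|>[...]<|tool_call_end|> when it decides to call the tool.
tool_inputs = tokenizer.apply_chat_template(
    messages,
    tools=[list_files],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(tool_inputs, max_new_tokens=256, do_sample=True, temperature=0.6)
print(tokenizer.decode(out[0][tool_inputs.shape[-1]:], skip_special_tokens=False))

Tool outputs would then be appended as messages with role tool and the conversation re-run so the model can produce its final answer.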

Limitations

  • Fine-tuned on one dataset config (kimi) and a single training epoch.
  • Strongly optimized for this trace style; may generalize unevenly to unrelated domains.
  • Tool-calling behavior quality depends on prompt/tool schema quality.

Reproducibility

Primary sources for this card:

  • Training notebook: lfm25-hermes-sft-a100-1.ipynb
  • W&B run: cherguelainea/lfm25-hermes-sft/a6kutghd