AyoubChLin/LFM2.5-1.2B-hermes-agent

Merged instruction model for agentic/function-calling behavior, fine-tuned from LiquidAI/LFM2.5-1.2B-Instruct on Hermes agent reasoning traces.

Model Details

  • Base model: LiquidAI/LFM2.5-1.2B-Instruct
  • Training method: Supervised Fine-Tuning (SFT) with LoRA, then merged into full weights (see the merge sketch after this list)
  • Intended repo: AyoubChLin/LFM2.5-1.2B-hermes-agent
  • Adapter used for merge: AyoubChLin/LFM2.5-1.2B-hermes-agent-lora
  • Frameworks: transformers, trl (SFTTrainer), peft, datasets
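
The merge step loads the base model, attaches the LoRA adapter, and folds the adapter weights into a full checkpoint. A minimal sketch with peft (repository IDs from the list above; the exact notebook code may differ):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "LiquidAI/LFM2.5-1.2B-Instruct"
adapter_id = "AyoubChLin/LFM2.5-1.2B-hermes-agent-lora"

# Load the base model in bf16 and attach the trained LoRA adapter
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, trust_remote_code=True
)
model = PeftModel.from_pretrained(base, adapter_id)

# Fold the adapter into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("LFM2.5-1.2B-hermes-agent")
AutoTokenizer.from_pretrained(base_id, trust_remote_code=True).save_pretrained(
    "LFM2.5-1.2B-hermes-agent"
)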

Training Data

Source dataset:

  • lambda/hermes-agent-reasoning-traces with config kimi

Data preprocessing in the training notebook (a sketch of this pipeline follows the list):

  • Converted ShareGPT turns to OpenAI-style role messages (system, user, assistant, tool)
  • Normalized <tool_call>...</tool_call> blocks into LFM tool-call format:
    • <|tool_call_start|>[...]<|tool_call_end|>
  • Normalized <tool_response>...</tool_response> to raw tool role content
  • Rendered each sample to LFM2.5 ChatML with special tokens:
    • <|startoftext|>, <|im_start|>, <|im_end|>, <|endoftext|>
  • Filtered out samples longer than the maximum sequence length of 16,384 tokens
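
A minimal sketch of the turn normalization, assuming ShareGPT-style turns with "from"/"value" fields and tool calls embedded as single JSON objects inside <tool_call> tags (the role map, field names, and regexes below are illustrative rather than the exact notebook code):

import json
import re

ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant", "tool": "tool"}

def normalize_turn(turn):
    """Map one ShareGPT turn to an OpenAI-style message with LFM tool-call tags."""
    role = ROLE_MAP.get(turn["from"], turn["from"])
    text = turn["value"]

    # <tool_call>{...}</tool_call>  ->  <|tool_call_start|>[{...}]<|tool_call_end|>
    def to_lfm(match):
        return ("<|tool_call_start|>"
                + json.dumps([json.loads(match.group(1))])
                + "<|tool_call_end|>")
    text = re.sub(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", to_lfm, text, flags=re.S)

    # <tool_response>...</tool_response>  ->  raw content carried by the "tool" role
    text = re.sub(r"</?tool_response>", "", text).strip()
    return {"role": role, "content": text}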

Dataset counts from the run:

  • Raw rows: 7,646
  • Kept after preprocessing: 4,987
  • Skipped as too long: 2,659
  • Malformed skipped: 0
  • Split: 4,887 train / 100 validation

Training Procedure

Main hyperparameters (from W&B run a6kutghd):

  • Epochs: 1
  • Max sequence length: 16,384
  • Packing: True
  • Per-device batch size: 4
  • Gradient accumulation: 4
  • Optimizer: adamw_8bit
  • Learning rate: 2e-4
  • LR scheduler: cosine
  • Warmup ratio: 0.03
  • Weight decay: 0.01
  • Max grad norm: 1.0
  • Precision: bf16=True, tf32=True
  • Gradient checkpointing: True (use_reentrant=False)
  • Seed: 42

LoRA setup (see the configuration sketch after this list):

  • r=32, alpha=64, dropout=0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • modules_to_save: embed_tokens, lm_head
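
Expressed as a peft/trl configuration, the run corresponds roughly to the sketch below (argument names follow current trl/peft releases, e.g. max_length vs. the older max_seq_length, and may differ slightly from the notebook):

from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="outputs",
    num_train_epochs=1,
    max_length=16384,
    packing=True,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="adamw_8bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    max_grad_norm=1.0,
    bf16=True,
    tf32=True,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    seed=42,
)

# trainer = SFTTrainer(model=model, args=training_args, train_dataset=train_ds,
#                      eval_dataset=eval_ds, peft_config=lora_config)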

Hardware (logged by W&B metadata):

  • 1x NVIDIA B200
  • CUDA environment in Modal

Training Results

W&B run: cherguelainea/lfm25-hermes-sft/a6kutghd

Final logged metrics:

  • train_loss: 0.4654
  • train/loss (last logged step): 0.3460
  • eval/loss: 0.3581
  • eval/mean_token_accuracy: 0.9141
  • train_runtime: 945.35s (~15.8 min)
  • train_samples_per_second: 3.42
  • train_steps_per_second: 0.215
  • eval/runtime: 6.44s
  • total_flos: 3.5557e17
  • train/global_step: 203

Note: W&B marks the run state as crashed, but training/evaluation metrics and model artifacts were logged and used for merge/push.

Intended Use

This model is intended for:

  • assistant-style chat
  • agent/tool-use workflows
  • function-calling style prompting and multi-turn reasoning traces

Not intended for:

  • safety-critical autonomous decisions
  • legal/medical/financial advice without human oversight

Prompt Format

Use the model's chat template via tokenizer.apply_chat_template whenever possible.

If you format prompts manually, note that training used the LFM2.5 ChatML structure with the special tokens and role blocks listed in the preprocessing section above.
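
A hand-formatted single-turn prompt would look roughly like the snippet below; the exact newline placement and whether <|startoftext|> is emitted are assumptions, so prefer the chat template for the authoritative layout:

# Illustrative hand-built prompt; verify against tokenizer.apply_chat_template.
prompt = (
    "<|startoftext|>"
    "<|im_start|>system\nYou are a helpful agent.<|im_end|>\n"
    "<|im_start|>user\nList Python files in /workspace and count them.<|im_end|>\n"
    "<|im_start|>assistant\n"
)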

How to Use

Inference (Transformers)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "AyoubChLin/LFM2.5-1.2B-hermes-agent"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "List Python files in /workspace and count them."}
]

# Build the prompt with the model's chat template
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    out = model.generate(
        inputs,
        max_new_tokens=512,
        do_sample=True,  # required for temperature/top_k to take effect
        temperature=0.6,
        top_k=50,
        repetition_penalty=1.05,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
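
For tool-use prompting, recent transformers releases let apply_chat_template accept a tools argument (Python callables or JSON schemas). Whether the bundled LFM2.5 chat template renders tools this way should be verified against the tokenizer config, and the list_files function here is a made-up example tool:

def list_files(directory: str):
    """
    List file names in a directory.

    Args:
        directory: Path of the directory to list.
    """
    import os
    return os.listdir(directory)

messages = [{"role": "user", "content": "How many Python files are in /workspace?"}]

# Expose the tool schema to the chat template; the model is expected to emit
# <|tool_call_start|>[...]<|tool_call_end|> when it decides to call the tool.
tool_inputs = tokenizer.apply_chat_template(
    messages,
    tools=[list_files],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(tool_inputs, max_new_tokens=256, do_sample=True, temperature=0.6)
print(tokenizer.decode(out[0][tool_inputs.shape[-1]:], skip_special_tokens=False))

Tool outputs would then be appended as messages with role tool and the conversation re-run so the model can produce its final answer.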

Limitations

  • Fine-tuned on one dataset config (kimi) and a single training epoch.
  • Strongly optimized for this trace style; may generalize unevenly to unrelated domains.
  • Tool-calling behavior quality depends on prompt/tool schema quality.

Reproducibility

Primary sources for this card:

  • Training notebook: lfm25-hermes-sft-a100-1.ipynb
  • W&B run: cherguelainea/lfm25-hermes-sft/a6kutghd