# AyoubChLin/LFM2.5-1.2B-hermes-agent

Merged instruction model for agentic/function-calling behavior, fine-tuned from `LiquidAI/LFM2.5-1.2B-Instruct` on Hermes agent reasoning traces.
## Model Details

- Base model: `LiquidAI/LFM2.5-1.2B-Instruct`
- Training method: Supervised Fine-Tuning (SFT) with LoRA, then merged into full weights
- Intended repo: `AyoubChLin/LFM2.5-1.2B-hermes-agent`
- Adapter used for the merge: `AyoubChLin/LFM2.5-1.2B-hermes-agent-lora`
- Frameworks: `transformers`, `trl` (`SFTTrainer`), `peft`, `datasets`
## Training Data

Source dataset: `lambda/hermes-agent-reasoning-traces` with config `kimi`.

Data preprocessing in the training notebook:

- Converted ShareGPT turns to OpenAI-style role messages (`system`, `user`, `assistant`, `tool`)
- Normalized `<tool_call>...</tool_call>` blocks into the LFM tool-call format: `<|tool_call_start|>[...]<|tool_call_end|>`
- Normalized `<tool_response>...</tool_response>` to raw `tool`-role content
- Rendered each sample to LFM2.5 ChatML with special tokens: `<|startoftext|>`, `<|im_start|>`, `<|im_end|>`, `<|endoftext|>`
- Filtered out over-length samples at a max sequence length of 16,384 tokens
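The `<tool_call>` normalization step can be sketched as a simple regex rewrite. This is a minimal illustration, not the notebook's actual code; the function name and regex are assumptions:

```python
import re

def normalize_tool_calls(text: str) -> str:
    """Rewrite Hermes-style <tool_call> blocks into the LFM
    <|tool_call_start|>[...]<|tool_call_end|> format."""
    return re.sub(
        r"<tool_call>\s*(\{.*?\})\s*</tool_call>",
        lambda m: "<|tool_call_start|>[" + m.group(1) + "]<|tool_call_end|>",
        text,
        flags=re.DOTALL,
    )
```

For example, `normalize_tool_calls('<tool_call>{"name": "ls", "arguments": {}}</tool_call>')` yields the call wrapped in the LFM special tokens; text without tool calls passes through unchanged.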
Dataset counts from the run:

- Raw rows: 7,646
- Kept after preprocessing: 4,987
- Skipped as too long: 2,659
- Skipped as malformed: 0
- Split: 4,887 train / 100 validation
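The length filter behind these counts can be sketched as follows. This is illustrative only; `tokenize_sample` stands in for the notebook's actual ChatML rendering and tokenization:

```python
def split_by_length(samples, tokenize_sample, max_seq_len=16_384):
    """Partition samples into (kept, skipped) by tokenized length,
    mirroring the kept/skipped counts reported above."""
    kept, skipped = [], []
    for sample in samples:
        if len(tokenize_sample(sample)) <= max_seq_len:
            kept.append(sample)
        else:
            skipped.append(sample)
    return kept, skipped
```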
## Training Procedure

Main hyperparameters (from W&B run `a6kutghd`):

- Epochs: 1
- Max sequence length: 16,384
- Packing: True
- Per-device batch size: 4
- Gradient accumulation: 4
- Optimizer: `adamw_8bit`
- Learning rate: 2e-4
- LR scheduler: cosine
- Warmup ratio: 0.03
- Weight decay: 0.01
- Max grad norm: 1.0
- Precision: `bf16=True`, `tf32=True`
- Gradient checkpointing: True (`use_reentrant=False`)
- Seed: 42
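Taken together, these settings imply an effective batch of 4 × 4 = 16 sequences per optimizer step on the single GPU. A minimal sketch of the configuration as plain Python; the key names mirror common `trl`/`transformers` argument names and are assumptions, not the notebook's verbatim code:

```python
# Hedged sketch of the run's main hyperparameters as a plain dict.
hparams = {
    "num_train_epochs": 1,
    "max_seq_length": 16_384,
    "packing": True,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_8bit",
    "learning_rate": 2e-4,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
    "weight_decay": 0.01,
    "max_grad_norm": 1.0,
    "bf16": True,
    "tf32": True,
    "gradient_checkpointing": True,
    "gradient_checkpointing_kwargs": {"use_reentrant": False},
    "seed": 42,
}

# Single GPU, so the effective batch per optimizer step is:
effective_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
)  # 4 * 4 = 16 sequences per step
```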
LoRA setup:

- `r=32`, `alpha=64`, `dropout=0.05`
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- `modules_to_save`: `embed_tokens`, `lm_head`
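This setup corresponds roughly to the following `peft` configuration — a sketch assuming a recent `peft` version, not the notebook's verbatim code:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    # These modules are trained fully (not as low-rank adapters)
    # and therefore land directly in the merged weights.
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)
```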
Hardware (logged by W&B metadata):

- 1x NVIDIA B200
- CUDA environment on Modal
## Training Results

W&B run: `cherguelainea/lfm25-hermes-sft/a6kutghd`

Final logged metrics:

- `train_loss`: 0.4654
- `train/loss` (last logged step): 0.3460
- `eval/loss`: 0.3581
- `eval/mean_token_accuracy`: 0.9141
- `train_runtime`: 945.35 s (~15.8 min)
- `train_samples_per_second`: 3.42
- `train_steps_per_second`: 0.215
- `eval/runtime`: 6.44 s
- `total_flos`: 3.5557e17
- `train/global_step`: 203
Note: W&B marks the run state as crashed, but the training/evaluation metrics and model artifacts were fully logged and were used for the merge and push.
## Intended Use

This model is intended for:

- assistant-style chat
- agent/tool-use workflows
- function-calling-style prompting and multi-turn reasoning traces

Not intended for:

- safety-critical autonomous decisions
- legal, medical, or financial advice without human oversight
## Prompt Format

Use the model's chat template through the tokenizer whenever possible. If formatting manually, note that training used the LFM2.5 ChatML structure with special tokens and role blocks.
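As an illustration only (the tokenizer's chat template is authoritative), a manually formatted single-turn prompt built from the special tokens listed in the Training Data section would look like the following; the system message content here is invented for the example:

```
<|startoftext|><|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
List Python files in /workspace and count them.<|im_end|>
<|im_start|>assistant
```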
## How to Use

### Inference (Transformers)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "AyoubChLin/LFM2.5-1.2B-hermes-agent"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "List Python files in /workspace and count them."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    out = model.generate(
        inputs,
        max_new_tokens=512,
        do_sample=True,  # required for temperature/top_k to take effect
        temperature=0.6,
        top_k=50,
        repetition_penalty=1.05,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```
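When the model emits tool calls, they appear between the LFM tool-call tokens described in the Training Data section; decode with `skip_special_tokens=False` to keep them visible. A minimal parser sketch — the exact payload shape inside the brackets depends on the training traces, and a JSON object per call is assumed here:

```python
import json
import re

def extract_tool_calls(generation: str):
    """Parse <|tool_call_start|>[...]<|tool_call_end|> spans out of a
    generated string and decode each bracketed payload as JSON."""
    pattern = r"<\|tool_call_start\|>\[(.*?)\]<\|tool_call_end\|>"
    return [
        json.loads(payload)
        for payload in re.findall(pattern, generation, flags=re.DOTALL)
    ]
```

Generations without tool calls simply return an empty list, so the parser can run unconditionally on every model turn.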
## Limitations

- Fine-tuned on a single dataset config (`kimi`) for a single epoch.
- Strongly optimized for this trace style; may generalize unevenly to unrelated domains.
- Tool-calling quality depends on the quality of the prompt and tool schemas.
## Reproducibility

Primary sources for this card:

- Training notebook: `lfm25-hermes-sft-a100-1.ipynb`
- W&B run: `cherguelainea/lfm25-hermes-sft/a6kutghd`