base_model: unsloth/LFM2-350M-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/LFM2-350M-unsloth-bnb-4bit
- lora
- qlora
- sft
- transformers
- trl
- conventional-commits
- code
lfm2_350m_commit_diff_summarizer (LoRA)
A lightweight helper model that turns Git diffs into Conventional Commit–style messages.
It outputs strict JSON with a short title (≤ 65 chars) and up to 3 bullets, so your CLI/agents can parse it deterministically.
Model Details
Model Description
Purpose: Summarize git diff patches into concise, Conventional Commit–compliant titles with optional bullets.
I/O format:
- Input: prompt containing the diff (plain text).
- Output: JSON object: {"title": "...", "bullets": ["...", "..."]}.
Developed by: Ethan (HF: ethanke)
Shared by: Ethan (HF: ethanke)
Model type: LoRA adapter for causal LM (text generation)
Language(s): English (commit message conventions)
License: Inherits base model’s license; dataset has non-commercial terms (see Training Data). Review before production/commercial use.
Finetuned from: unsloth/LFM2-350M-unsloth-bnb-4bit (4-bit quantized base, trained with QLoRA)
Model Sources
- Repository: This model card + adapter on the Hub under ethanke/lfm2_350m_commit_diff_summarizer
Uses
Direct Use
- Convert patch diffs into Conventional Commit messages for PR titles, commits, and changelogs.
- Provide human-readable summaries in agent UIs in a fixed JSON structure (validate the output before use).
Downstream Use
- Plug into CI to auto-suggest commit titles after tests pass.
- Use as a helper in a larger agent system (router/planner stays in a bigger model).
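For CI or agent integration, the parsed JSON usually has to be turned back into commit message text. The following is a minimal sketch of that step; `format_commit_message` is a hypothetical helper name, and the layout (title, blank line, dash-prefixed bullets) is an assumed convention rather than something the adapter prescribes.

```python
def format_commit_message(obj: dict) -> str:
    """Render {"title": ..., "bullets": [...]} as commit message text."""
    title = obj["title"].strip()
    # Drop empty bullets; a missing "bullets" key is treated as no bullets.
    bullets = [b.strip() for b in obj.get("bullets", []) if b.strip()]
    if not bullets:
        return title
    body = "\n".join(f"- {b}" for b in bullets)
    # A blank line separates the subject from the body, as Git expects.
    return f"{title}\n\n{body}"


print(format_commit_message({"title": "fix(api): handle empty diff",
                             "bullets": ["return early on empty input"]}))
```

The result can be offered as a suggestion in a PR comment or written to a file for the developer to review before committing.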
Out-of-Scope Use
- General code generation or deep refactoring explanations.
- Non-English commit conventions.
- Knowledge-intensive narrative summaries.
Bias, Risks, and Limitations
- Trained on public commits filtered to Conventional Commit titles; may prefer certain styles/projects.
- Long diffs are truncated to max_length, so the summary may miss changes that fall outside the truncated window.
- Dataset license may restrict commercial usage; verify for your case.
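One way to make truncation less arbitrary is to trim the diff at hunk boundaries before building the prompt, so that the model never sees a half-cut hunk. The sketch below is an illustration only: `trim_diff` and its `max_chars` budget are hypothetical, and a character budget is used as a rough stand-in for the token limit.

```python
def trim_diff(diff: str, max_chars: int = 6000) -> str:
    """Keep whole file/hunk segments from the start of the diff within a character budget."""
    if len(diff) <= max_chars:
        return diff

    # Split into segments that each start at a file header or a hunk header.
    segments, current = [], []
    for line in diff.splitlines(keepends=True):
        if line.startswith(("diff --git", "@@")) and current:
            segments.append("".join(current))
            current = []
        current.append(line)
    if current:
        segments.append("".join(current))

    # Greedily keep leading segments while they fit in the budget.
    kept, used = [], 0
    for seg in segments:
        if used + len(seg) > max_chars:
            break
        kept.append(seg)
        used += len(seg)

    # If even the first segment is too large, fall back to a hard cut.
    return "".join(kept) if kept else diff[:max_chars]
```

A suitable `max_chars` depends on the tokenizer and on how much of the 2048-token window the instruction text consumes, so it should be tuned rather than taken from this sketch.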
Recommendations
- Enforce JSON validation; if invalid, retry with a JSON-repair prompt.
- Keep a regex gate for Conventional Commit titles in your pipeline.
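The two recommendations above can be combined into a single gate. This is a minimal sketch, assuming the title pattern is the same regex used for dataset filtering and that a missing `bullets` field is acceptable; `check_output` is a hypothetical helper name.

```python
import json
import re

# Same Conventional Commit pattern as the dataset filter in this card.
CC_TITLE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)


def check_output(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the output passed every gate."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top-level value is not an object"]

    problems = []
    title = obj.get("title")
    if not isinstance(title, str):
        problems.append("missing or non-string 'title'")
    else:
        if len(title) > 65:
            problems.append("title longer than 65 characters")
        if not CC_TITLE.match(title):
            problems.append("title is not a Conventional Commit")

    bullets = obj.get("bullets", [])  # assumption: bullets are optional
    if not isinstance(bullets, list) or not all(isinstance(b, str) for b in bullets):
        problems.append("'bullets' must be a list of strings")
    elif len(bullets) > 3:
        problems.append("more than 3 bullets")
    return problems
```

A non-empty result can be fed into the retry prompt so that the model is told which constraint it violated.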
How to Get Started
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch, json

BASE = "unsloth/LFM2-350M-unsloth-bnb-4bit"
ADAPTER = "ethanke/lfm2_350m_commit_diff_summarizer"  # replace with your repo id

# 4-bit NF4 quantization with fp16 compute
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.float16)

tok = AutoTokenizer.from_pretrained(BASE, use_fast=True)
mdl = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
mdl = PeftModel.from_pretrained(mdl, ADAPTER)  # attach the LoRA adapter to the base model

diff = "...your git diff text..."
prompt = (
    "You are a commit message summarizer.\n"
    "Return a concise JSON object with fields 'title' (<=65 chars) and 'bullets' (0-3 items).\n"
    "Follow the Conventional Commit style for the title.\n\n"
    "### DIFF\n" + diff + "\n\n### OUTPUT JSON\n"
)

inputs = tok(prompt, return_tensors="pt").to(mdl.device)
with torch.no_grad():
    # greedy decoding keeps the output reproducible
    out = mdl.generate(**inputs, max_new_tokens=200, do_sample=False)
text = tok.decode(out[0], skip_special_tokens=True)  # decoded text includes the prompt

# naive JSON extraction: take the span from the last "{" to the last "}"
js = text[text.rfind("{"): text.rfind("}")+1]
obj = json.loads(js)
print(obj)
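The slice-based extraction above breaks if the model emits extra text or a stray brace after the JSON. A slightly more tolerant alternative, sketched here with the standard library only, walks backwards over every `{` and returns the first position from which a JSON object parses. `extract_last_json_object` is a hypothetical helper name.

```python
import json


def extract_last_json_object(text: str):
    """Return the last parseable JSON object in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.rfind("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            obj, _ = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        # Try the previous "{" in the text.
        start = text.rfind("{", 0, start)
    return None
```

This relies on the output schema being flat (a string and a list of strings). With nested objects, the innermost object would be found first, so a different strategy would be needed.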
Training Details
Training Data
- Dataset: Maxscha/commitbench (diff → commit message).
- Filtering: kept only samples whose first non-empty line of the message matches Conventional Commits: ^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?(!)?:\s.+$
- Note: The dataset card indicates non-commercial licensing. Confirm before commercial deployment.
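The filtering rule can be expressed as a small predicate. This is a sketch of the described rule, not the actual preprocessing script; stripping whitespace from the first line before matching is an assumption, and the function names are hypothetical.

```python
import re

CC_PATTERN = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)


def first_non_empty_line(message: str) -> str:
    """Return the first line that contains non-whitespace characters, stripped."""
    for line in message.splitlines():
        if line.strip():
            return line.strip()
    return ""


def is_conventional(message: str) -> bool:
    """True if the commit subject line follows the Conventional Commit pattern."""
    return CC_PATTERN.match(first_non_empty_line(message)) is not None


messages = ["feat(parser)!: drop legacy mode\n\nLonger body.", "Fix typo", "chore: bump deps"]
print([m.splitlines()[0] for m in messages if is_conventional(m)])
```

Note that the pattern is case-sensitive and requires whitespace after the colon, so subjects such as `Fix: typo` or `fix:typo` are rejected.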
Training Procedure
Method: Supervised fine-tuning (SFT) with TRL SFTTrainer + QLoRA (PEFT).
Prompting: Instruction + ### DIFF + ### OUTPUT JSON target (title/bullets).
Precision: fp16 compute on 4-bit base.
Hyperparameters (v0.1):
- max_length=2048, per_device_train_batch_size=2, grad_accum=4
- lr=2e-4, scheduler=cosine, warmup_ratio=0.03
- epochs=1 over capped subset
- LoRA: r=16, alpha=32, dropout=0.05, targets: q/k/v/o + MLP proj
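A few derived quantities follow directly from these settings. The sketch below works them out, assuming a single GPU (as listed under Environmental Impact) and the standard LoRA scaling factor of alpha / r; `RunConfig` is a hypothetical container, not part of TRL or PEFT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    max_length: int = 2048
    per_device_train_batch_size: int = 2
    gradient_accumulation_steps: int = 4
    lora_r: int = 16
    lora_alpha: int = 32

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to each optimizer step on one device.
        return self.per_device_train_batch_size * self.gradient_accumulation_steps

    @property
    def max_tokens_per_step(self) -> int:
        # Upper bound: every sequence padded or truncated to max_length.
        return self.effective_batch_size * self.max_length

    @property
    def lora_scale(self) -> float:
        # Standard LoRA scaling applied to the low-rank update.
        return self.lora_alpha / self.lora_r


cfg = RunConfig()
print(cfg.effective_batch_size, cfg.max_tokens_per_step, cfg.lora_scale)  # 8 16384 2.0
```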
Evaluation
Validation: filtered split from CommitBench.
Metrics (example run):
- eval_loss ≈ 1.18 → perplexity ≈ 3.26
- eval_mean_token_accuracy ≈ 0.77
- Suggested task metrics: JSON validity rate, CC-title compliance, title length ≤ 65 chars, bullets ≤ 3.
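Perplexity is the exponential of the mean token-level loss, so exp(1.18) comes out at roughly 3.25, in line with the reported figure once rounding of the loss is taken into account. The suggested task metrics can be computed over a batch of raw model outputs as sketched below; `task_metrics` is a hypothetical helper, and treating a missing `bullets` field as valid is an assumption.

```python
import json
import math
import re

CC_TITLE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)


def perplexity(eval_loss: float) -> float:
    """Perplexity from a mean cross-entropy loss measured in nats."""
    return math.exp(eval_loss)


def task_metrics(outputs: list[str]) -> dict[str, float]:
    """Fraction of outputs that pass each structural check."""
    counts = {"json_valid": 0, "cc_title": 0, "title_len_ok": 0, "bullets_ok": 0}
    for raw in outputs:
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if not isinstance(obj, dict):
            continue
        counts["json_valid"] += 1
        title = obj.get("title", "")
        bullets = obj.get("bullets", [])  # assumption: bullets are optional
        if isinstance(title, str):
            counts["cc_title"] += bool(CC_TITLE.match(title))
            counts["title_len_ok"] += 0 < len(title) <= 65
        counts["bullets_ok"] += isinstance(bullets, list) and len(bullets) <= 3
    n = len(outputs)
    return {k: (v / n if n else 0.0) for k, v in counts.items()}
```

All rates are computed over the full batch, so an output that fails to parse counts against every metric.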
Environmental Impact
- Hardware: 1× NVIDIA RTX 3060 12 GB (local)
- Hours used: ~1–2 h (prototype)
Technical Specifications
- Architecture: LFM2-350M (decoder-only) + LoRA adapter
- Libraries:
transformers,trl,peft,bitsandbytes,datasets,unsloth
Citation
If you use this model, please cite the base model and dataset authors according to their cards.
Model Card Authors
- Ethan (ethanke) and contributors
Contact
- Open an issue on the Hub repo or message ethanke on Hugging Face.
Framework versions
- PEFT 0.17.1
- TRL (SFTTrainer)
- Transformers (recent version)