# LFM2.5-1.2B-MOAT

**Multi-task Optimized Assessment Tool** — a finetuned LiquidAI/LFM2.5-1.2B-Instruct model for recruitment AI.

Handles two tasks with a single model:

- **CV-JD Assessment** — match scoring + qualitative analysis
- **Keyword Extraction** — structured keyword extraction from job descriptions and CVs
## Training
- Base model: LiquidAI/LFM2.5-1.2B-Instruct (1.2B params, hybrid Mamba2 + Attention)
- Stage 1 — Multi-task SFT: 39,641 examples (19,588 assessments + 20,053 keywords), LoRA r=32/α=64, 1 epoch, LR=5e-5
- Stage 2 — Targeted DPO: 2,374 filtered problematic pairs (|score diff| ≥ 5pts), LoRA r=16/α=32, beta=0.2, LR=5e-6
- Hardware: NVIDIA RTX 5080 16GB, total training time ~3.5 hours
- Training data: Gemini-generated assessments and keyword extractions across tech, healthcare, finance, and blue collar domains
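The two LoRA stages above could be configured roughly as follows with `peft` (a sketch only: `target_modules="all-linear"` is an assumption, since the actual adapter target modules for the hybrid Mamba2 + Attention stack are not documented here, and `beta`/learning rates belong to the trainer config rather than the adapter):

```python
from peft import LoraConfig

# Stage 1 — multi-task SFT adapter (r=32, alpha=64; LR=5e-5 is set on the trainer)
sft_lora = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules="all-linear",  # assumption: real target modules not documented
    task_type="CAUSAL_LM",
)

# Stage 2 — targeted DPO adapter (r=16, alpha=32; beta=0.2 and LR=5e-6
# are DPO trainer settings, not part of the LoRA adapter itself)
dpo_lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # assumption
    task_type="CAUSAL_LM",
)
```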
## Performance

### CV-JD Assessment (4,898 held-out samples)
| Metric | V1 Baseline | MOAT V2 | Target |
|---|---|---|---|
| JSON Parse Rate | 97.0% | 99.9% | ≥95% |
| Score MAE | 13.1 pts | 6.82 pts | <8 |
| Score Bias | -13.0 pts | +1.53 pts | ~0 |
| Verdict Accuracy | 50.0% | 76.8% | >60% |
| Within 5 pts | — | 51.4% | — |
| Within 10 pts | — | 77.5% | — |
| Median Absolute Error | — | 4.90 pts | — |
### Keyword Extraction (10 diverse samples across domains)
| Field | Accuracy |
|---|---|
| JSON Parse Rate | 100% |
| Schema Complete | 100% |
| Experience Years | 100% |
| Domain | 90% |
| Education | 80% |
| Seniority | 80% |
| Skills (avg F1) | 0.58 |
Skills F1 varies by domain: white collar (0.74-0.84) > blue collar/healthcare (0.33-0.58). The model extracts correct skills but sometimes at different granularity than reference labels.
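A set-based F1 over the extracted vs. reference skill lists is one plausible way to read the "Skills (avg F1)" number; the exact matching rules used in the original evaluation are not documented here, so the helper below is illustrative:

```python
def skills_f1(predicted: list[str], reference: list[str]) -> float:
    """Case-insensitive, exact-match set F1 between two skill lists
    (a sketch; the original evaluation's matching rules may differ)."""
    pred = {s.lower() for s in predicted}
    ref = {s.lower() for s in reference}
    if not pred or not ref:
        return 0.0
    tp = len(pred & ref)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Exact-match F1 like this punishes granularity mismatches (e.g. "forklift operation" vs. "forklift"), which is consistent with the blue-collar scores reported above.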
## Usage with vLLM

```python
from vllm import LLM, SamplingParams

model = LLM(
    model="GazTrab/LFM2.5-1.2B-MOAT",
    max_model_len=4096,
    gpu_memory_utilization=0.85,
    dtype="bfloat16",
    trust_remote_code=True,
    max_num_seqs=64,
)
tokenizer = model.get_tokenizer()

sampling_params = SamplingParams(
    temperature=0.1,
    top_p=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_tokens=2048,
)

# Build the prompt using the chat template
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = model.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```
## Important Notes

- `max_model_len=4096` — the model was trained with this context length
- `temperature=0.1`, `top_p=0.1` — low temperature for consistent structured output
- `trust_remote_code=True` — required for the LFM2.5 architecture (hybrid Mamba2 + Attention)
- Prompts exceeding ~2048 tokens should be truncated (leave room for generation)
- The model outputs raw JSON — no markdown fences needed
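Because the model emits raw JSON, a single `json.loads` normally suffices; a defensive parser (hypothetical helper, not part of the model card's API) can also tolerate stray whitespace or an accidental markdown fence:

```python
import json

def parse_model_json(text: str) -> dict:
    """Parse the model's raw JSON output, stripping whitespace and any
    accidental ```json fences (defensive sketch; normally unnecessary)."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop an opening fence line like ```json and a trailing ```
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)
```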
## Usage with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "GazTrab/LFM2.5-1.2B-MOAT"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]
input_ids = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=2048,
    temperature=0.1,
    top_p=0.1,
    top_k=50,
    repetition_penalty=1.05,
    do_sample=True,
)
# Decode only the newly generated tokens
response = tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
## Task Prompts

### Task 1: CV-JD Assessment

**System prompt:**

```
You are an expert recruitment AI that analyzes CV-JD compatibility.
You MUST respond with valid JSON only. No additional text before or after the JSON.

Output schema:
{
  "match_score": <float 0-100>,
  "executive_summary": "<2-3 sentence overview>",
  "strengths": ["<quantified strength 1>", "<quantified strength 2>", ...],
  "gaps": ["<specific gap 1>", "<specific gap 2>", ...],
  "recommendation": "Interview|Consider|Not recommended",
  "verdict": "STRONG_MATCH|GOOD_MATCH|MODERATE_MATCH|WEAK_MATCH|NOT_SUITABLE"
}

Guidelines:
- Be specific and quantified in strengths/gaps (e.g., "5/7 required skills", "3 years below requirement")
- Reference actual skills from the JD and CV
- Verdict must align with match_score brackets
- Keep strengths and gaps to 2-4 items each
```
**User prompt format:**

```
Analyze the following CV against the Job Description and provide a structured assessment.

=== JOB DESCRIPTION ===
{jd_text}

=== CANDIDATE CV ===
{cv_text}

Respond with JSON only:
```
**Verdict-to-score mapping:**
| Verdict | Score Range |
|---|---|
| STRONG_MATCH | 85-100 |
| GOOD_MATCH | 70-84 |
| MODERATE_MATCH | 50-69 |
| WEAK_MATCH | 30-49 |
| NOT_SUITABLE | 0-29 |
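Since the system prompt requires the verdict to align with the score brackets, the mapping above can be checked programmatically (hypothetical helper name, mirroring the table):

```python
def score_to_verdict(score: float) -> str:
    """Map a match_score (0-100) to its verdict bracket per the table above."""
    if score >= 85:
        return "STRONG_MATCH"
    if score >= 70:
        return "GOOD_MATCH"
    if score >= 50:
        return "MODERATE_MATCH"
    if score >= 30:
        return "WEAK_MATCH"
    return "NOT_SUITABLE"
```

A post-processing step could compare the model's `verdict` field against `score_to_verdict(match_score)` and flag disagreements.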
### Task 2: Keyword Extraction

**System prompt:**

```
You are an expert recruitment AI that extracts structured keywords from documents.
You MUST respond with valid JSON only. No additional text before or after the JSON.

Output schema:
{
  "skills": ["<skill 1>", "<skill 2>", ...],
  "experience_years": <integer>,
  "education": "<phd|master|bachelor|associate|diploma|certificate|high_school|none>",
  "certifications": ["<cert 1>", "<cert 2>", ...],
  "domain": "<2-4 word domain>",
  "seniority": "<intern|junior|mid|senior|lead|principal|director|manager>"
}

Guidelines:
- Extract only explicitly stated skills, not inferred ones
- For CVs: infer experience_years from work history dates
- For JDs: use the stated requirement, or 0 if not specified
- Skills should be lowercase
- Keep domain to 2-4 words
```
**User prompt format (for JDs):**

```
Extract structured keywords from the following Job Description.

=== JOB DESCRIPTION ===
{jd_text}

Respond with JSON only:
```
**User prompt format (for CVs):**

```
Extract structured keywords from the following CV/Resume.

=== CANDIDATE CV ===
{cv_text}

Respond with JSON only:
```
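The user prompt templates above are plain string assembly; a minimal sketch (helper names are illustrative, not part of the model card):

```python
def build_jd_keyword_prompt(jd_text: str) -> str:
    """Fill the JD keyword-extraction user prompt template shown above."""
    return (
        "Extract structured keywords from the following Job Description.\n\n"
        "=== JOB DESCRIPTION ===\n"
        f"{jd_text}\n\n"
        "Respond with JSON only:"
    )

def build_cv_keyword_prompt(cv_text: str) -> str:
    """Fill the CV keyword-extraction user prompt template shown above."""
    return (
        "Extract structured keywords from the following CV/Resume.\n\n"
        "=== CANDIDATE CV ===\n"
        f"{cv_text}\n\n"
        "Respond with JSON only:"
    )
```

The result goes into the `user` message of the chat template, alongside the Task 2 system prompt.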
## Limitations
- Low-score bias: Scores in the 0-20 range tend to be overestimated by ~8 points (model struggles to score below ~17)
- Blue collar granularity: Keyword extraction for trade/blue collar roles sometimes outputs overly verbose skill descriptions
- Training data domains: Primarily trained on tech, healthcare, and finance — generalizes to other domains but with slightly lower quality
- Context length: Long CVs or JDs may need truncation to stay within the 2048-token prompt budget
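To stay within the ~2048-token prompt budget, long documents can be truncated by token count before templating; a minimal sketch that keeps the beginning of the text (`tokenizer` is any object with `encode`/`decode`, e.g. the model's Hugging Face tokenizer):

```python
def truncate_to_budget(text: str, tokenizer, max_tokens: int = 2048) -> str:
    """Truncate text to at most max_tokens tokens, keeping the start.
    Sketch only: a real pipeline should also budget for the other
    document and the template boilerplate sharing the same prompt."""
    ids = tokenizer.encode(text)
    if len(ids) <= max_tokens:
        return text
    return tokenizer.decode(ids[:max_tokens])
```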
## Citation

```bibtex
@misc{gaztrab2026moat,
  title={LFM2.5-1.2B-MOAT: Multi-task Optimized Assessment Tool for Recruitment},
  author={GazTrab},
  year={2026},
  url={https://huggingface.co/GazTrab/LFM2.5-1.2B-MOAT}
}
```