LFM2.5-1.2B-MOAT

Multi-task Optimized Assessment Tool — a fine-tuned LiquidAI/LFM2.5-1.2B-Instruct model for recruitment AI.

Handles two tasks with a single model:

  1. CV-JD Assessment — Match scoring + qualitative analysis
  2. Keyword Extraction — Structured keyword extraction from job descriptions and CVs

Training

  • Base model: LiquidAI/LFM2.5-1.2B-Instruct (1.2B params, hybrid Mamba2 + Attention)
  • Stage 1 — Multi-task SFT: 39,641 examples (19,588 assessments + 20,053 keywords), LoRA r=32/α=64, 1 epoch, LR=5e-5
  • Stage 2 — Targeted DPO: 2,374 filtered problematic pairs (|score diff| ≥ 5pts), LoRA r=16/α=32, beta=0.2, LR=5e-6
  • Hardware: NVIDIA RTX 5080 16GB, total training time ~3.5 hours
  • Training data: Gemini-generated assessments and keyword extractions across tech, healthcare, finance, and blue collar domains
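
The two LoRA stages above can be sketched as peft configs. This is an illustrative reconstruction from the listed hyperparameters, not the published training script; the dropout value is an assumption, as it is not stated above.

```python
# Illustrative LoRA configs matching the hyperparameters listed above.
# ASSUMPTION: lora_dropout and target module defaults are not documented here.
from peft import LoraConfig

stage1_sft_lora = LoraConfig(
    r=32,
    lora_alpha=64,       # Stage 1: multi-task SFT, 1 epoch, LR=5e-5
    lora_dropout=0.05,   # assumption, not stated in the card
    task_type="CAUSAL_LM",
)

stage2_dpo_lora = LoraConfig(
    r=16,
    lora_alpha=32,       # Stage 2: targeted DPO, beta=0.2, LR=5e-6
    task_type="CAUSAL_LM",
)
```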

Performance

CV-JD Assessment (4,898 held-out samples)

Metric                  V1 Baseline   MOAT        V2 Target
JSON Parse Rate         97.0%         99.9%       ≥95%
Score MAE               13.1 pts      6.82 pts    <8
Score Bias              -13.0 pts     +1.53 pts   ~0
Verdict Accuracy        50.0%         76.8%       >60%
Within 5 pts            —             51.4%       —
Within 10 pts           —             77.5%       —
Median Absolute Error   —             4.90 pts    —
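
The MAE and bias figures above can be reproduced from (predicted, reference) score pairs with a few lines; this is a generic sketch, not the evaluation harness used for the held-out set.

```python
def score_errors(pairs):
    """Compute MAE and signed bias for (predicted, reference) score pairs.

    Bias > 0 means the model over-scores on average; bias < 0 means it
    under-scores (the V1 baseline's -13.0 pts above).
    """
    diffs = [pred - ref for pred, ref in pairs]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    bias = sum(diffs) / len(diffs)
    return mae, bias

# Toy example (not the held-out set):
mae, bias = score_errors([(80, 75), (60, 68), (30, 29)])
```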

Keyword Extraction (10 diverse samples across domains)

Field              Accuracy
JSON Parse Rate    100%
Schema Complete    100%
Experience Years   100%
Domain             90%
Education          80%
Seniority          80%
Skills (avg F1)    0.58

Skills F1 varies by domain: white-collar roles (0.74-0.84) score higher than blue-collar and healthcare roles (0.33-0.58). The model extracts correct skills, but sometimes at a different granularity than the reference labels.
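
The skills F1 above is a set overlap between predicted and reference skill lists. A typical computation (exact match on lowercased strings is an assumption about the eval) looks like this, and shows why granularity mismatches are punished: "pipe welding" vs. "welding" scores zero under exact match.

```python
def skills_f1(predicted, reference):
    """Exact-match F1 between two skill lists (lowercased, deduplicated).

    ASSUMPTION: the card's eval uses exact string match; a fuzzier matcher
    would reward partial overlaps like "pipe welding" vs "welding".
    """
    pred = {s.lower() for s in predicted}
    ref = {s.lower() for s in reference}
    if not pred or not ref:
        return 0.0
    tp = len(pred & ref)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)
```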

Usage with vLLM

from vllm import LLM, SamplingParams

model = LLM(
    model="GazTrab/LFM2.5-1.2B-MOAT",
    max_model_len=4096,
    gpu_memory_utilization=0.85,
    dtype="bfloat16",
    trust_remote_code=True,
    max_num_seqs=64,
)
tokenizer = model.get_tokenizer()

sampling_params = SamplingParams(
    temperature=0.1,
    top_p=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_tokens=2048,
)

# Build prompt using chat template
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = model.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)

Important Notes

  • max_model_len=4096 — the model was trained with this context length
  • temperature=0.1, top_p=0.1 — low temperature for consistent structured output
  • trust_remote_code=True — required for the LFM2.5 architecture (hybrid Mamba2 + Attention)
  • Prompts exceeding ~2048 tokens should be truncated (leave room for generation)
  • The model outputs raw JSON — no markdown fences needed
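
Since the model emits raw JSON with no markdown fences, a thin parse-with-fallback wrapper is enough on the consumer side. This helper is an illustrative sketch (not part of the model's API); it also strips fences defensively in case a prompt variation reintroduces them.

```python
import json

def parse_model_json(text):
    """Parse the model's raw JSON output, tolerating stray whitespace/fences.

    Returns the parsed dict, or None so the caller can retry or discard.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Defensive: the model shouldn't emit fences, but strip them if present.
        cleaned = cleaned.strip("`")
        if cleaned.startswith("json"):
            cleaned = cleaned[4:]
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return None
```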

Usage with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "GazTrab/LFM2.5-1.2B-MOAT"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=2048,
    temperature=0.1,
    top_p=0.1,
    top_k=50,
    repetition_penalty=1.05,
    do_sample=True,
)
response = tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)

Task Prompts

Task 1: CV-JD Assessment

System prompt:

You are an expert recruitment AI that analyzes CV-JD compatibility.
You MUST respond with valid JSON only. No additional text before or after the JSON.

Output schema:
{
  "match_score": <float 0-100>,
  "executive_summary": "<2-3 sentence overview>",
  "strengths": ["<quantified strength 1>", "<quantified strength 2>", ...],
  "gaps": ["<specific gap 1>", "<specific gap 2>", ...],
  "recommendation": "Interview|Consider|Not recommended",
  "verdict": "STRONG_MATCH|GOOD_MATCH|MODERATE_MATCH|WEAK_MATCH|NOT_SUITABLE"
}

Guidelines:
- Be specific and quantified in strengths/gaps (e.g., "5/7 required skills", "3 years below requirement")
- Reference actual skills from the JD and CV
- Verdict must align with match_score brackets
- Keep strengths and gaps to 2-4 items each

User prompt format:

Analyze the following CV against the Job Description and provide a structured assessment.

=== JOB DESCRIPTION ===
{jd_text}

=== CANDIDATE CV ===
{cv_text}

Respond with JSON only:

Verdict-to-score mapping:

Verdict          Score Range
STRONG_MATCH     85-100
GOOD_MATCH       70-84
MODERATE_MATCH   50-69
WEAK_MATCH       30-49
NOT_SUITABLE     0-29
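
The bracket table above maps directly to a lookup that can be used to check the "verdict must align with match_score brackets" guideline. This helper is illustrative, not part of the model.

```python
# Lower bounds of each verdict bracket from the table above, descending.
VERDICT_BRACKETS = [
    (85, "STRONG_MATCH"),
    (70, "GOOD_MATCH"),
    (50, "MODERATE_MATCH"),
    (30, "WEAK_MATCH"),
    (0, "NOT_SUITABLE"),
]

def expected_verdict(match_score):
    """Return the verdict bracket a match_score (0-100) falls into."""
    for lower, verdict in VERDICT_BRACKETS:
        if match_score >= lower:
            return verdict
    return "NOT_SUITABLE"
```

A consistency check is then a one-liner: `assessment["verdict"] == expected_verdict(assessment["match_score"])`.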

Task 2: Keyword Extraction

System prompt:

You are an expert recruitment AI that extracts structured keywords from documents.
You MUST respond with valid JSON only. No additional text before or after the JSON.

Output schema:
{
  "skills": ["<skill 1>", "<skill 2>", ...],
  "experience_years": <integer>,
  "education": "<phd|master|bachelor|associate|diploma|certificate|high_school|none>",
  "certifications": ["<cert 1>", "<cert 2>", ...],
  "domain": "<2-4 word domain>",
  "seniority": "<intern|junior|mid|senior|lead|principal|director|manager>"
}

Guidelines:
- Extract only explicitly stated skills, not inferred ones
- For CVs: infer experience_years from work history dates
- For JDs: use the stated requirement, or 0 if not specified
- Skills should be lowercase
- Keep domain to 2-4 words

User prompt format (for JDs):

Extract structured keywords from the following Job Description.

=== JOB DESCRIPTION ===
{jd_text}

Respond with JSON only:

User prompt format (for CVs):

Extract structured keywords from the following CV/Resume.

=== CANDIDATE CV ===
{cv_text}

Respond with JSON only:
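
The two user-prompt formats above differ only in the header and section marker, so a small helper can produce both. The function name and `doc_type` values are illustrative, not part of the model's API.

```python
def build_keyword_prompt(text, doc_type):
    """Format the keyword-extraction user prompt for a JD or a CV.

    doc_type: "jd" for job descriptions, "cv" for resumes (hypothetical flags).
    """
    if doc_type == "jd":
        header, section = "Job Description", "=== JOB DESCRIPTION ==="
    elif doc_type == "cv":
        header, section = "CV/Resume", "=== CANDIDATE CV ==="
    else:
        raise ValueError(f"unknown doc_type: {doc_type}")
    return (
        f"Extract structured keywords from the following {header}.\n\n"
        f"{section}\n{text}\n\n"
        "Respond with JSON only:"
    )
```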

Limitations

  • Low-score bias: Scores in the 0-20 range tend to be overestimated by ~8 points (model struggles to score below ~17)
  • Blue collar granularity: Keyword extraction for trade/blue collar roles sometimes outputs overly verbose skill descriptions
  • Training data domains: Primarily trained on tech, healthcare, and finance — generalizes to other domains but with slightly lower quality
  • Context length: Long CVs or JDs may need truncation to stay within the 2048-token prompt budget
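
One way to enforce the ~2048-token prompt budget mentioned above is token-level truncation of the input documents before building the prompt. This is a generic sketch against any tokenizer exposing `encode`/`decode`; how the budget is split between JD and CV is left to the caller.

```python
def truncate_to_budget(text, tokenizer, max_tokens):
    """Trim text to at most max_tokens tokens using the model tokenizer.

    Truncating at the token level (rather than by characters) keeps the
    count aligned with what the model actually sees.
    """
    ids = tokenizer.encode(text, add_special_tokens=False)
    if len(ids) <= max_tokens:
        return text
    return tokenizer.decode(ids[:max_tokens])
```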

Citation

@misc{gaztrab2026moat,
  title={LFM2.5-1.2B-MOAT: Multi-task Optimized Assessment Tool for Recruitment},
  author={GazTrab},
  year={2026},
  url={https://huggingface.co/GazTrab/LFM2.5-1.2B-MOAT}
}