DistilBERT Resume NER (Token Classification)

This model is a fine-tuned version of a custom domain-adapted DistilBERT model. It is specifically designed to perform Named Entity Recognition (NER) on raw, unstructured resume text.

By leveraging a two-phase training strategy (Domain Adaptation via Section Classification, followed by a Token Classification head-swap), this model achieves highly accurate entity extraction across diverse global resume formats, messy Unstructured API text, and complex layout structures.

πŸ† Model Performance

This model achieves state-of-the-art level performance for resume entity extraction on the evaluation set:

  • F1 Score: 0.9699
  • Precision: 0.9678
  • Recall: 0.9719
  • Accuracy: 0.9876
  • Validation Loss: 0.0892

🧠 The Context-Aware Architecture

Unlike standard NER models, this model supports Explicit Context Control: it was trained to accept text prefixed with structural section labels (e.g., [education], [experience], [skills]). Prepending one of these tags to a text chunk tells the model which resume section the chunk came from, which steers its predictions toward the entity types expected in that section and makes extraction more consistent.

🧠 Explicit Context Control (How it works)

This model optionally resolves ambiguity by looking at the section prefix. For example, the word "Java" could be a core competency or just a minor detail in a past project.

By prefixing your text, you force the model's behavior:

  • "[skills] Java, Python, C++" → The model extracts these as SKILL entities with near 100% confidence.
  • "[experience] Built the backend using Java..." → The model treats this as narrative text and tags "Java" as a SKILL or as part of a PROJECT_NAME, depending on the surrounding context.
  • "[objective] ..." or "[contact] ..." → The model favors O (Outside) tags for narrative text that carries no target entities, reducing false positives.

🏷️ Supported Context Prefixes & Output Labels

1. Context Control Prefixes (15 Classes)

Before passing text to the model, optionally prepend one of the following section tags in brackets (e.g., [experience]) if you want to strictly guide the model's extraction logic based on the structural context of the text:

| Prefix | Description & Examples |
|---|---|
| [contact] | Emails, phone numbers, addresses, LinkedIn/GitHub URLs. |
| [summary] | Professional summaries, profiles, or executive overviews. |
| [objective] | Career objectives and personal statements. |
| [experience] | Work history, NYSC, SIWES, internships. |
| [education] | Degrees (BSc, HND, PhD), institutions, and grades. |
| [skills] | Technical skills, soft skills, programming languages. |
| [certifications] | Professional certs (AWS, ICAN, PMP), including "In View" status. |
| [projects] | Personal or professional projects and open-source contributions. |
| [awards] | Honors, scholarships, and Dean's Lists. |
| [hobbies] | Interests, passions, and extracurricular activities. |
| [languages] | Spoken languages and proficiency levels (e.g., Fluent, B2). |
| [volunteer] | Community service and pro-bono work. |
| [publications] | Research papers, articles, and academic journals. |
| [references] | Referees or "References available upon request" statements. |
| [additional_info] | Relocation willingness, visa status, clearance, etc. |
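
A minimal sketch of how a text chunk might be prefixed before inference. The helper `add_context_prefix` and the constant `SECTION_PREFIXES` are illustrative assumptions, not part of the model or of the transformers library; the set simply mirrors the 15 prefixes listed above.

```python
# Hypothetical helper: not shipped with the model.
SECTION_PREFIXES = frozenset({
    "contact", "summary", "objective", "experience", "education",
    "skills", "certifications", "projects", "awards", "hobbies",
    "languages", "volunteer", "publications", "references",
    "additional_info",
})


def add_context_prefix(text, section=None):
    """Return text with an optional "[section] " prefix prepended."""
    if section is None:
        return text  # the prefix is optional, so plain text passes through
    # Accept "skills", "[skills]" or "Skills" and normalize to "skills".
    section = section.strip().strip("[]").lower()
    if section not in SECTION_PREFIXES:
        raise ValueError(f"Unknown section prefix: {section!r}")
    return f"[{section}] {text}"


print(add_context_prefix("Java, Python, C++", "skills"))
# [skills] Java, Python, C++
```

Validating against a fixed set catches typos such as "[skill]", which the model was presumably never trained on.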

2. Output NER Labels (37 BIO Tags)

The model extracts 18 core entities using the strict BIO (Beginning, Inside, Outside) format. This results in 37 unique classification tags (18 Entities × 2 tags + 1 "O" tag).

| Entity Category | Core Entity Type | BIO Tags | Example |
|---|---|---|---|
| Experience | JOB_TITLE | B-JOB_TITLE, I-JOB_TITLE | Senior Software Engineer |
| Experience | COMPANY | B-COMPANY, I-COMPANY | Google, Tech Solutions Ltd |
| Experience | LOCATION | B-LOCATION, I-LOCATION | Lagos, Nigeria; Remote |
| Experience | DATE_START | B-DATE_START, I-DATE_START | Jan 2020 |
| Experience | DATE_END | B-DATE_END, I-DATE_END | Present, Aug 2023 |
| Experience | BULLET_ACHIEVEMENT | B-BULLET_ACHIEVEMENT, I-BULLET_ACHIEVEMENT | Led a team of 5 engineers... |
| Education | DEGREE | B-DEGREE, I-DEGREE | B.Sc., HND, PhD |
| Education | FIELD_OF_STUDY | B-FIELD_OF_STUDY, I-FIELD_OF_STUDY | Computer Science |
| Education | INSTITUTION | B-INSTITUTION, I-INSTITUTION | University of Lagos |
| Education | YEAR | B-YEAR, I-YEAR | 2018, 2021 |
| Education | GPA | B-GPA, I-GPA | 3.8/4.0, First Class |
| Skills & Certs | SKILL | B-SKILL, I-SKILL | Python, AWS, Agile |
| Skills & Certs | CERT_NAME | B-CERT_NAME, I-CERT_NAME | CCNA, PMP, Six Sigma |
| Skills & Certs | CERT_ISSUER | B-CERT_ISSUER, I-CERT_ISSUER | Udemy, Cisco, Amazon |
| Others | AWARD_NAME | B-AWARD_NAME, I-AWARD_NAME | Rising Star Award |
| Others | LANGUAGE | B-LANGUAGE, I-LANGUAGE | Igbo, English, Spanish |
| Others | PROFICIENCY | B-PROFICIENCY, I-PROFICIENCY | Native, Fluent, Beginner |
| Others | PROJECT_NAME | B-PROJECT_NAME, I-PROJECT_NAME | E-commerce Web App |
| Neutral | OTHER | O | Text outside any entity |

Full BIO Tag List (Exhaustive)

For programmatic access, the model's id2label mapping contains the following 37 indices: O, B-JOB_TITLE, I-JOB_TITLE, B-COMPANY, I-COMPANY, B-LOCATION, I-LOCATION, B-DATE_START, I-DATE_START, B-DATE_END, I-DATE_END, B-BULLET_ACHIEVEMENT, I-BULLET_ACHIEVEMENT, B-DEGREE, I-DEGREE, B-FIELD_OF_STUDY, I-FIELD_OF_STUDY, B-INSTITUTION, I-INSTITUTION, B-YEAR, I-YEAR, B-GPA, I-GPA, B-SKILL, I-SKILL, B-CERT_NAME, I-CERT_NAME, B-CERT_ISSUER, I-CERT_ISSUER, B-AWARD_NAME, I-AWARD_NAME, B-LANGUAGE, I-LANGUAGE, B-PROFICIENCY, I-PROFICIENCY, B-PROJECT_NAME, I-PROJECT_NAME.
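
The 37-tag count can be reproduced from the 18 entity types. The sketch below rebuilds the tag list in the order given above; `ENTITY_TYPES`, `build_bio_labels`, and `ID2LABEL_SKETCH` are illustrative names, and the index order is an assumption based on this listing. The authoritative mapping is whatever the loaded model's `config.id2label` reports.

```python
# Entity types in the order they appear in the tag list above.
ENTITY_TYPES = [
    "JOB_TITLE", "COMPANY", "LOCATION", "DATE_START", "DATE_END",
    "BULLET_ACHIEVEMENT", "DEGREE", "FIELD_OF_STUDY", "INSTITUTION",
    "YEAR", "GPA", "SKILL", "CERT_NAME", "CERT_ISSUER", "AWARD_NAME",
    "LANGUAGE", "PROFICIENCY", "PROJECT_NAME",
]


def build_bio_labels(entity_types):
    """Expand entity types into BIO tags: one "O" plus B-/I- per type."""
    labels = ["O"]
    for name in entity_types:
        labels.extend([f"B-{name}", f"I-{name}"])
    return labels


BIO_LABELS = build_bio_labels(ENTITY_TYPES)

# Assumption: indices follow the listing order; check model.config.id2label.
ID2LABEL_SKETCH = dict(enumerate(BIO_LABELS))

print(len(BIO_LABELS))  # 37
```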


🚀 Usage Tip: Handling the Prefix

When using the pipeline, the [context_label] prefix (e.g., [experience]) should be classified as O (Outside). As a safeguard, filter out any tokens that match your context prefix or its brackets before building your final JSON output.

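# Example filter logic (ner_pipeline is the pipeline created in the Usage section below)
prefix_tokens = {'[experience]', '[', 'experience', ']'}
entities = ner_pipeline("[experience] Senior Developer at Google")
filtered_entities = [e for e in entities if e['word'].lower() not in prefix_tokens]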
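
Matching on the `word` field could miss the prefix if the tokenizer splits or regroups it differently than expected. An alternative sketch drops every entity that starts inside the leading prefix, assuming the pipeline output carries the `start` and `end` character offsets that the token-classification pipeline reports with a fast tokenizer. The function `drop_prefix_entities` is hypothetical, and the sample entities are hand-written to show the shape of the data, not real model output.

```python
import re

# A leading "[section]" tag plus any whitespace that follows it.
PREFIX_RE = re.compile(r"^\s*\[[A-Za-z_]+\]\s*")


def drop_prefix_entities(text, entities):
    """Remove entities whose span starts inside a leading "[section]" prefix."""
    match = PREFIX_RE.match(text)
    if match is None:
        return list(entities)  # no prefix present, nothing to filter
    prefix_end = match.end()
    return [e for e in entities if e["start"] >= prefix_end]


# Hand-written sample in the shape of aggregated pipeline output.
text = "[experience] Senior Developer at Google"
sample = [
    {"entity_group": "JOB_TITLE", "word": "[ experience ]", "start": 0, "end": 12},
    {"entity_group": "JOB_TITLE", "word": "Senior Developer", "start": 13, "end": 29},
    {"entity_group": "COMPANY", "word": "Google", "start": 33, "end": 39},
]

print([e["word"] for e in drop_prefix_entities(text, sample)])
# ['Senior Developer', 'Google']
```

Filtering by offset does not depend on how the prefix was tokenized, so it also covers a prefix that was merged into a single aggregated span.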

🚀 Usage

You can use this model in your application using the Hugging Face pipeline. To get the best results, optionally prepend the relevant context tag to the chunk of text you are parsing.

Python API

from transformers import pipeline

# 1. Initialize the NER pipeline
ner_pipeline = pipeline(
    "token-classification", 
    model="amosify/distilbert-resume-ner-v1",
    aggregation_strategy="simple"
)

# 2. Prepare your text WITH the context prefix
raw_text = "Graduated with a B.Sc. in Computer Science from the University of Lagos with a 3.8/4.0 GPA in August 2021."
context_aware_text = f"[education] {raw_text}"

# 3. Extract entities
entities = ner_pipeline(context_aware_text)

# 4. View Results
for entity in entities:
    # We ignore the "[education]" prefix token itself if it gets tagged
    if entity['word'].lower() not in ['[education]', '[', 'education', ']']:
        print(f"Entity: {entity['word']} | Label: {entity['entity_group']} | Confidence: {entity['score']:.4f}")
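
For downstream use, the aggregated output can be folded into a JSON-friendly structure. A minimal sketch, assuming each entity dict carries the `entity_group`, `word`, and `score` keys used in the loop above. The function `group_entities_by_label`, the `min_score` threshold, and the sample entities (including their scores) are illustrative placeholders, not real model output.

```python
import json
from collections import defaultdict


def group_entities_by_label(entities, min_score=0.0):
    """Collect entity text under its label, skipping low-confidence hits."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Hand-written placeholder data in the shape of aggregated pipeline output.
sample = [
    {"entity_group": "DEGREE", "word": "B.Sc.", "score": 0.9},
    {"entity_group": "FIELD_OF_STUDY", "word": "Computer Science", "score": 0.9},
    {"entity_group": "INSTITUTION", "word": "University of Lagos", "score": 0.4},
]

# With min_score=0.5, the placeholder INSTITUTION entry is dropped.
print(json.dumps(group_entities_by_label(sample, min_score=0.5), indent=2))
```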

📊 Training Hyperparameters

The following hyperparameters were used during training:

  • Learning Rate: 2e-05
  • Train Batch Size: 32
  • Eval Batch Size: 32
  • Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
  • LR Scheduler: linear
  • Epochs: 5
  • Weight Decay: 0.01
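
To illustrate what a linear scheduler implies, the sketch below computes the learning rate at a given step under linear decay from 2e-05 to zero. The warmup length and total step count are not stated in this card, so `warmup_steps=0` and the 1000-step run are placeholder assumptions, and `linear_lr` is an illustrative function rather than the scheduler implementation used in training.

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Placeholder run of 1000 steps with no warmup (both values are assumptions).
for step in (0, 500, 1000):
    print(step, f"{linear_lr(step, total_steps=1000):.2e}")
```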

Training Log

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 Score | Accuracy |
|---|---|---|---|---|---|---|
| 1.0 | No log | 0.1482 | 0.9499 | 0.9693 | 0.9595 | 0.9850 |
| 2.0 | 0.6363 | 0.1107 | 0.9679 | 0.9605 | 0.9642 | 0.9864 |
| 3.0 | 0.6363 | 0.0934 | 0.9660 | 0.9745 | 0.9703 | 0.9882 |
| 4.0 | 0.0919 | 0.0889 | 0.9689 | 0.9713 | 0.9701 | 0.9876 |
| 5.0 | 0.0919 | 0.0892 | 0.9678 | 0.9719 | 0.9699 | 0.9876 |
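
F1 is the harmonic mean of precision and recall, so the logged values can be cross-checked directly. The sketch below recomputes F1 for each epoch from the rows above and reports which epoch has the highest F1 and which has the lowest validation loss; the names `LOG` and `f1_score` are illustrative.

```python
# Rows copied from the training log: (epoch, val_loss, precision, recall, f1)
LOG = [
    (1, 0.1482, 0.9499, 0.9693, 0.9595),
    (2, 0.1107, 0.9679, 0.9605, 0.9642),
    (3, 0.0934, 0.9660, 0.9745, 0.9703),
    (4, 0.0889, 0.9689, 0.9713, 0.9701),
    (5, 0.0892, 0.9678, 0.9719, 0.9699),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for epoch, _, p, r, reported in LOG:
    # Logged values are rounded to four decimals, so allow a small tolerance.
    assert abs(f1_score(p, r) - reported) < 1.5e-4, epoch

best_f1_epoch = max(LOG, key=lambda row: row[4])[0]
lowest_loss_epoch = min(LOG, key=lambda row: row[1])[0]
print(best_f1_epoch, lowest_loss_epoch)  # 3 4
```

The differences between epochs 3, 4, and 5 are within a few ten-thousandths, so the reported epoch 5 figures are close to the best values in the log.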

βš™οΈ Framework Versions

  • Transformers 5.7.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.8.5
  • Tokenizers 0.22.2

Model size: 66.4M params (Safetensors, tensor type F32)