DistilBERT Resume NER (Token Classification)

This model is a fine-tuned version of a custom domain-adapted DistilBERT model. It is specifically designed to perform Named Entity Recognition (NER) on raw, unstructured resume text.

Training used a two-phase strategy: domain adaptation via section classification, followed by a head swap to token classification. The model is intended to extract entities accurately across diverse global resume formats, noisy text extracted with the Unstructured API, and complex layouts.

🏆 Model Performance

On the evaluation set, the model reports the following results after the final (fifth) training epoch:

F1 Score: 0.9699
Precision: 0.9678
Recall: 0.9719
Accuracy: 0.9876
Validation Loss: 0.0892

🧠 The Context-Aware Architecture

Unlike standard NER models, this model features Explicit Context Control. It was trained to accept text prefixed with structural section labels (e.g., [education], [experience], [skills]). Prepending these tags to your text chunks gives the model an explicit signal about which kind of section it is reading, which helps it label ambiguous tokens more consistently.

🧠 Explicit Context Control (How it works)

When a section prefix is present, the model uses it to resolve ambiguity. For example, the word "Java" could be a core competency or just a minor detail in a past project.

Prefixing your text steers the model's behavior:

"[skills] Java, Python, C++" → The model extracts these as SKILL entities with high confidence.
"[experience] Built the backend using Java..." → The model treats this as narrative text and labels "Java" according to that context (e.g., as a SKILL, or as part of a PROJECT_NAME span), depending on your specific BIO mapping.
"[objective] ..." or "[contact] ..." → The model has learned to output O (Outside) tags for narrative text that contains no target entities, which reduces false positives.

🏷️ Supported Context Prefixes & Output Labels

1. Context Control Prefixes (15 Classes)

Before passing text to the model, you can prepend one of the following bracketed section tags (e.g., [experience]) to guide extraction based on the structural context of the text:

Prefix	Description & Examples
`[contact]`	Emails, phone numbers, addresses, LinkedIn/GitHub URLs.
`[summary]`	Professional summaries, profiles, or executive overviews.
`[objective]`	Career objectives and personal statements.
`[experience]`	Work history, NYSC, SIWES, internships.
`[education]`	Degrees (BSc, HND, PhD), institutions, and grades.
`[skills]`	Technical skills, soft skills, programming languages.
`[certifications]`	Professional certs (AWS, ICAN, PMP), including "In View" status.
`[projects]`	Personal or professional projects and open-source contributions.
`[awards]`	Honors, scholarships, and Dean's Lists.
`[hobbies]`	Interests, passions, and extracurricular activities.
`[languages]`	Spoken languages and proficiency levels (e.g., Fluent, B2).
`[volunteer]`	Community service and pro-bono work.
`[publications]`	Research papers, articles, and academic journals.
`[references]`	Referees or "References available upon request" statements.
`[additional_info]`	Relocation willingness, visa status, clearance, etc.
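
One way to apply these prefixes consistently is a small helper that validates the section name before prepending it. The sketch below is illustrative only: `add_context_prefix` and `SECTION_PREFIXES` are hypothetical names, not part of the model's API, and the lowercase tag followed by a single space is assumed from the examples in this card.

```python
from typing import Optional

# The 15 documented section prefixes (without brackets).
SECTION_PREFIXES = frozenset({
    "contact", "summary", "objective", "experience", "education",
    "skills", "certifications", "projects", "awards", "hobbies",
    "languages", "volunteer", "publications", "references",
    "additional_info",
})


def add_context_prefix(text: str, section: Optional[str] = None) -> str:
    """Return text as "[section] text", or unchanged when no section is given."""
    if section is None:
        return text  # the prefix is optional
    key = section.strip().lower()
    if key not in SECTION_PREFIXES:
        raise ValueError(f"Unknown section prefix: {section!r}")
    return f"[{key}] {text.strip()}"


print(add_context_prefix("Java, Python, C++", "skills"))
# [skills] Java, Python, C++
```

Rejecting unknown section names early avoids silently sending the model a tag it was never trained on.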

2. Output NER Labels (37 BIO Tags)

The model extracts 18 core entities using the strict BIO (Beginning, Inside, Outside) format. This results in 37 unique classification tags (18 Entities × 2 tags + 1 "O" tag).

Entity Category	Core Entity Type	BIO Tags	Example
Experience	`JOB_TITLE`	`B-JOB_TITLE`, `I-JOB_TITLE`	Senior Software Engineer
	`COMPANY`	`B-COMPANY`, `I-COMPANY`	Google, Tech Solutions Ltd
	`LOCATION`	`B-LOCATION`, `I-LOCATION`	Lagos, Nigeria; Remote
	`DATE_START`	`B-DATE_START`, `I-DATE_START`	Jan 2020
	`DATE_END`	`B-DATE_END`, `I-DATE_END`	Present, Aug 2023
	`BULLET_ACHIEVEMENT`	`B-BULLET_ACHIEVEMENT`, `I-BULLET_ACHIEVEMENT`	Led a team of 5 engineers...
Education	`DEGREE`	`B-DEGREE`, `I-DEGREE`	B.Sc., HND, PhD
	`FIELD_OF_STUDY`	`B-FIELD_OF_STUDY`, `I-FIELD_OF_STUDY`	Computer Science
	`INSTITUTION`	`B-INSTITUTION`, `I-INSTITUTION`	University of Lagos
	`YEAR`	`B-YEAR`, `I-YEAR`	2018, 2021
	`GPA`	`B-GPA`, `I-GPA`	3.8/4.0, First Class
Skills & Certs	`SKILL`	`B-SKILL`, `I-SKILL`	Python, AWS, Agile
	`CERT_NAME`	`B-CERT_NAME`, `I-CERT_NAME`	CCNA, PMP, Six Sigma
	`CERT_ISSUER`	`B-CERT_ISSUER`, `I-CERT_ISSUER`	Udemy, Cisco, Amazon
Others	`AWARD_NAME`	`B-AWARD_NAME`, `I-AWARD_NAME`	Rising Star Award
	`LANGUAGE`	`B-LANGUAGE`, `I-LANGUAGE`	Igbo, English, Spanish
	`PROFICIENCY`	`B-PROFICIENCY`, `I-PROFICIENCY`	Native, Fluent, Beginner
	`PROJECT_NAME`	`B-PROJECT_NAME`, `I-PROJECT_NAME`	E-commerce Web App
Neutral	`OTHER`	`O`	Text outside any entity
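
To make the BIO scheme concrete, the sketch below merges per-token tags into entity spans. It is a minimal illustration, not the pipeline's own aggregation logic; `bio_to_spans` is a hypothetical helper, and the tokens and tags are hand-written examples rather than model output.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_type, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the currently open span, if any.
        nonlocal current_type, current_tokens
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span
            flush()
    flush()
    return spans


tokens = ["Senior", "Software", "Engineer", "at", "Google"]
tags = ["B-JOB_TITLE", "I-JOB_TITLE", "I-JOB_TITLE", "O", "B-COMPANY"]
print(bio_to_spans(tokens, tags))
# [('JOB_TITLE', 'Senior Software Engineer'), ('COMPANY', 'Google')]
```

In this sketch an I- tag with no matching open span is simply discarded; other decoders treat it as the start of a new span, so the choice is a design decision rather than part of the BIO format.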

Full BIO Tag List (Exhaustive)

For programmatic access, the model's id2label mapping contains the following 37 labels: O, B-JOB_TITLE, I-JOB_TITLE, B-COMPANY, I-COMPANY, B-LOCATION, I-LOCATION, B-DATE_START, I-DATE_START, B-DATE_END, I-DATE_END, B-BULLET_ACHIEVEMENT, I-BULLET_ACHIEVEMENT, B-DEGREE, I-DEGREE, B-FIELD_OF_STUDY, I-FIELD_OF_STUDY, B-INSTITUTION, I-INSTITUTION, B-YEAR, I-YEAR, B-GPA, I-GPA, B-SKILL, I-SKILL, B-CERT_NAME, I-CERT_NAME, B-CERT_ISSUER, I-CERT_ISSUER, B-AWARD_NAME, I-AWARD_NAME, B-LANGUAGE, I-LANGUAGE, B-PROFICIENCY, I-PROFICIENCY, B-PROJECT_NAME, I-PROJECT_NAME.
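
The tag list can also be generated rather than typed out. The sketch below rebuilds it from the 18 entity types, assuming the index order matches the order listed above. That ordering is an assumption, so treat the checkpoint's own `config.id2label` as authoritative.

```python
# The 18 entity types, in the order they are listed in this card.
ENTITY_TYPES = [
    "JOB_TITLE", "COMPANY", "LOCATION", "DATE_START", "DATE_END",
    "BULLET_ACHIEVEMENT", "DEGREE", "FIELD_OF_STUDY", "INSTITUTION",
    "YEAR", "GPA", "SKILL", "CERT_NAME", "CERT_ISSUER", "AWARD_NAME",
    "LANGUAGE", "PROFICIENCY", "PROJECT_NAME",
]

# "O" first, then a B-/I- pair per entity type (assumed index order).
labels = ["O"] + [f"{prefix}-{entity}" for entity in ENTITY_TYPES for prefix in ("B", "I")]
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

print(len(labels))  # 37
```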

🚀 Usage Tip: Handling the Prefix

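When using the pipeline, the [context_label] prefix (e.g., [experience]) is expected to be classified as O (Outside), and the pipeline omits O-labeled tokens from its output by default. As a safeguard against the prefix occasionally being mislabeled, filter out any tokens matching your context prefixes or brackets in your final JSON extraction logic.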

# Example filter logic (ner_pipeline is created as shown in the Usage section)
entities = ner_pipeline("[experience] Senior Developer at Google")
prefix_tokens = {"[experience]", "[", "experience", "]"}
filtered_entities = [e for e in entities if e["word"].lower() not in prefix_tokens]
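
Matching on the `word` field would also drop a genuine entity whose text happens to equal a section name. A possible alternative is to filter on character offsets, since aggregated pipeline results carry `start` and `end` positions when a fast tokenizer is used. The sketch below shows one way to do that; `strip_prefix_entities` is a hypothetical helper, and the sample dictionaries are hand-written stand-ins for pipeline output, not real predictions.

```python
from typing import Dict, List


def strip_prefix_entities(entities: List[Dict], prefix: str) -> List[Dict]:
    """Drop entities that start inside the prefix and shift offsets to the raw text.

    Assumes the model input was built as f"{prefix} {raw_text}".
    """
    offset = len(prefix) + 1  # prefix plus the single separating space
    cleaned = []
    for ent in entities:
        if ent["start"] < offset:
            continue  # entity starts inside the prefix
        shifted = dict(ent)
        shifted["start"] = ent["start"] - offset
        shifted["end"] = ent["end"] - offset
        cleaned.append(shifted)
    return cleaned


# Hand-written sample in the shape of aggregated pipeline output
# for the input "[experience] Senior Developer at Google".
raw_text = "Senior Developer at Google"
sample = [
    {"entity_group": "SKILL", "word": "experience", "score": 0.41, "start": 1, "end": 11},
    {"entity_group": "JOB_TITLE", "word": "senior developer", "score": 0.99, "start": 13, "end": 29},
    {"entity_group": "COMPANY", "word": "google", "score": 0.98, "start": 33, "end": 39},
]
for ent in strip_prefix_entities(sample, "[experience]"):
    print(ent["entity_group"], raw_text[ent["start"]:ent["end"]])
# JOB_TITLE Senior Developer
# COMPANY Google
```

Shifting the offsets has the side benefit that entity text can be sliced from the original string, which preserves the casing that an uncased tokenizer discards.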

🚀 Usage

You can use this model through the Hugging Face pipeline API. For best results, prepend the relevant context tag to each chunk of text you parse.

Python API

from transformers import pipeline

# 1. Initialize the NER pipeline
ner_pipeline = pipeline(
    "token-classification", 
    model="amosify/distilbert-resume-ner-v1",
    aggregation_strategy="simple"
)

# 2. Prepare your text WITH the context prefix
raw_text = "Graduated with a B.Sc. in Computer Science from the University of Lagos with a 3.8/4.0 GPA in August 2021."
context_aware_text = f"[education] {raw_text}"

# 3. Extract entities
entities = ner_pipeline(context_aware_text)

# 4. View Results
for entity in entities:
    # We ignore the "[education]" prefix token itself if it gets tagged
    if entity['word'].lower() not in ['[education]', '[', 'education', ']']:
        print(f"Entity: {entity['word']} | Label: {entity['entity_group']} | Confidence: {entity['score']:.4f}")
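
For downstream use, the flat entity list is often easier to handle once grouped by label. The sketch below shows one possible shape for that step; `group_entities` and its `min_score` threshold are hypothetical, and the sample list is a hand-written stand-in for `ner_pipeline` output rather than real predictions.

```python
import json
from collections import defaultdict
from typing import Dict, List


def group_entities(entities: List[Dict], min_score: float = 0.5) -> Dict[str, List[str]]:
    """Group entity text by label, skipping low-confidence and duplicate entries."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in entities:
        label = ent["entity_group"]
        word = ent["word"].strip()
        if not word or ent["score"] < min_score:
            continue
        if word not in grouped[label]:
            grouped[label].append(word)
    return dict(grouped)


# Hand-written sample in the shape of aggregated pipeline output.
sample = [
    {"entity_group": "FIELD_OF_STUDY", "word": "computer science", "score": 0.98},
    {"entity_group": "INSTITUTION", "word": "university of lagos", "score": 0.97},
    {"entity_group": "INSTITUTION", "word": "university of lagos", "score": 0.95},
    {"entity_group": "YEAR", "word": "2021", "score": 0.30},
]
print(json.dumps(group_entities(sample), indent=2))
```

The threshold of 0.5 is arbitrary; a suitable value depends on how your application trades missed entities against false positives.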

📊 Training Hyperparameters

The following hyperparameters were used during training:

Learning Rate: 2e-05
Train Batch Size: 32
Eval Batch Size: 32
Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
LR Scheduler: linear
Epochs: 5
Weight Decay: 0.01

Training Log

Epoch	Training Loss	Validation Loss	Precision	Recall	F1 Score	Accuracy
1.0	No log	0.1482	0.9499	0.9693	0.9595	0.9850
2.0	0.6363	0.1107	0.9679	0.9605	0.9642	0.9864
3.0	0.6363	0.0934	0.9660	0.9745	0.9703	0.9882
4.0	0.0919	0.0889	0.9689	0.9713	0.9701	0.9876
5.0	0.0919	0.0892	0.9678	0.9719	0.9699	0.9876
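
As a sanity check on the log, F1 should equal the harmonic mean of precision and recall, F1 = 2PR / (P + R), up to rounding. The sketch below recomputes it from the rows above and reports which epoch is best by each criterion; the tolerance is an arbitrary choice meant to absorb four-decimal rounding.

```python
# (epoch, validation_loss, precision, recall, f1) copied from the training log.
LOG = [
    (1, 0.1482, 0.9499, 0.9693, 0.9595),
    (2, 0.1107, 0.9679, 0.9605, 0.9642),
    (3, 0.0934, 0.9660, 0.9745, 0.9703),
    (4, 0.0889, 0.9689, 0.9713, 0.9701),
    (5, 0.0892, 0.9678, 0.9719, 0.9699),
]

TOLERANCE = 2e-4  # assumed slack for values rounded to four decimals


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for epoch, _, p, r, f1 in LOG:
    assert abs(harmonic_f1(p, r) - f1) < TOLERANCE, f"epoch {epoch} is inconsistent"

best_by_loss = min(LOG, key=lambda row: row[1])[0]
best_by_f1 = max(LOG, key=lambda row: row[4])[0]
print(best_by_loss, best_by_f1)  # 4 3
```

In this log the lowest validation loss (epoch 4) and the highest F1 (epoch 3) fall on different epochs, while the headline metrics in this card correspond to epoch 5.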

⚙️ Framework Versions

Transformers 5.7.0
PyTorch 2.10.0+cu128
Datasets 4.8.5
Tokenizers 0.22.2

Safetensors
Model size: 66.4M params
Tensor type: F32

Model tree for amosify/distilbert-resume-ner-v1

Base model: distilbert/distilbert-base-uncased
Finetuned: amosify/resume-section-classifier-v1
Finetuned: this model