# Resume NER BERT v2 (ONNX)
Pre-exported ONNX version of yashpwr/resume-ner-bert-v2 for direct use without Python/PyTorch dependencies.
Converted using Optimum for use in lucidRESUME, a local-first resume analysis desktop app built with .NET and ONNX Runtime.
## Model Details

| Detail | Value |
|---|---|
| Original model | yashpwr/resume-ner-bert-v2 |
| Architecture | BertForTokenClassification (bert-base-cased) |
| Parameters | 107.7M |
| Task | Token classification (NER, BIO scheme) |
| License | Apache 2.0 |
| Export tool | `optimum.exporters.onnx` (Optimum + Transformers 4.57) |
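To reproduce an export like this one, Optimum's CLI can convert the original checkpoint to ONNX; a sketch, assuming `optimum[exporters]` is installed (the output directory name is illustrative):

```shell
# Export the original PyTorch checkpoint to ONNX with Optimum's CLI.
# --task pins the export head to token classification.
optimum-cli export onnx \
  --model yashpwr/resume-ner-bert-v2 \
  --task token-classification \
  resume-ner-bert-v2-onnx/
```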
## Performance
| Metric | Score |
|---|---|
| F1 | 90.87% |
| Precision | 91.44% |
| Recall | 90.81% |
## Entity Types
The model recognises 12 entity types using BIO tagging (25 labels total):
| Entity | Description |
|---|---|
| Name | Person's full name |
| Email Address | Email contact |
| Phone | Phone number |
| Location | Geographic location |
| Companies worked at | Previous employers |
| Designation | Job titles / roles |
| Skills | Technical and soft skills |
| Years of Experience | Work duration |
| Degree | Educational qualifications |
| College Name | Educational institutions |
| Graduation Year | Year of degree completion |
| UNKNOWN | Unclassified entities |
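The BIO scheme expands each of the 12 entity types into `B-` (begin) and `I-` (inside) tags, plus a single `O` tag for non-entity tokens, which is where the 25-label total comes from. A quick sketch (the exact label strings in the model's `id2label` config may differ; these mirror the table above):

```python
# Reconstruct the BIO label space: B-X and I-X per entity type, plus "O".
# Entity names follow the table above; the model's actual id2label strings
# are an assumption here.
ENTITY_TYPES = [
    "Name", "Email Address", "Phone", "Location", "Companies worked at",
    "Designation", "Skills", "Years of Experience", "Degree",
    "College Name", "Graduation Year", "UNKNOWN",
]

def bio_labels(entity_types):
    """Expand entity types into a BIO tag set: O, then B-X/I-X per type X."""
    labels = ["O"]
    for ent in entity_types:
        labels += [f"B-{ent}", f"I-{ent}"]
    return labels

labels = bio_labels(ENTITY_TYPES)
print(len(labels))  # 12 entity types -> 25 BIO labels
```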
## Usage

### With ONNX Runtime (.NET)
This model is used by lucidRESUME for resume entity extraction via ONNX Runtime in C#. The app downloads this model automatically on first launch -- no Python required.
### With ONNX Runtime (Python)

```python
from optimum.onnxruntime import ORTModelForTokenClassification
from transformers import AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("scottgal/resume-ner-bert-v2-onnx")
model = ORTModelForTokenClassification.from_pretrained("scottgal/resume-ner-bert-v2-onnx")
ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

results = ner("John Smith, Software Engineer at Google with 5 years of experience. BSc Computer Science from MIT.")
for entity in results:
    print(f"{entity['entity_group']}: {entity['word']} ({entity['score']:.2f})")
```
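With `aggregation_strategy="simple"`, each result is a dict carrying `entity_group`, `word`, and `score`, which makes it easy to bucket predictions by entity type for structured output. A minimal sketch, using illustrative sample data rather than real model predictions:

```python
from collections import defaultdict

def group_entities(results, min_score=0.5):
    """Bucket NER pipeline output by entity_group, dropping low-confidence spans."""
    grouped = defaultdict(list)
    for ent in results:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

# Illustrative records shaped like the "simple" aggregation strategy's output;
# real spans and scores will differ.
sample = [
    {"entity_group": "Name", "word": "John Smith", "score": 0.99},
    {"entity_group": "Designation", "word": "Software Engineer", "score": 0.97},
    {"entity_group": "Companies worked at", "word": "Google", "score": 0.95},
    {"entity_group": "College Name", "word": "MIT", "score": 0.40},
]
print(group_entities(sample))
# The 0.40-score "MIT" span falls below min_score and is filtered out.
```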
## Training Data
The original model was trained on 22,542 samples:
- Resume-Corpus Dataset -- 349 samples
- DataTurks Resume NER -- 420 samples
- Custom Training Data -- 21,773 samples (rule-based extraction)
- Mehyaar Skills Dataset -- skills-focused data
See the original model card for full training details.
## Limitations
- English only
- Best with text-based resumes (not scanned images)
- Primarily trained on technology and business resumes
- Works best on resumes under 512 tokens (BERT's input limit); the model was trained with a max sequence length of 128
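Given the 128-token training length, long resumes are best split into overlapping chunks before inference and the per-chunk predictions merged afterwards. A minimal word-level windowing sketch (the model's subword tokenizer would be the more precise unit; the window and overlap sizes here are illustrative):

```python
def window_text(text, window=100, overlap=20):
    """Split text into overlapping word windows so each chunk stays well
    under the model's sequence limit; the overlap reduces the chance of
    cutting an entity span at a chunk boundary."""
    words = text.split()
    step = window - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break
    return chunks

# 250 words with a 100-word window and 20-word overlap -> 3 chunks.
chunks = window_text("word " * 250, window=100, overlap=20)
print(len(chunks))
```

Each chunk can then be passed to the `ner` pipeline shown above, keeping offsets per chunk to de-duplicate entities found in the overlapping regions.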
## Links
- Original model: yashpwr/resume-ner-bert-v2
- lucidRESUME: github.com/scottgal/lucidRESUME -- local-first desktop app for resume analysis, job matching, and career planning
- Conversion tool: Hugging Face Optimum