---
license: mit
base_model: MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli
library_name: peft
language:
- en
pipeline_tag: text-classification
tags:
- entity-matching
- person-name-matching
- record-linkage
- deduplication
- lora
- deberta-v3
- peft
metrics:
- f1
- precision
- recall
- accuracy
extra_gated_prompt: |-
  Acknowledge the intended use and limitations before downloading.
extra_gated_fields:
  Name: text
  Affiliation: text
  Intended use (one sentence): text
  I have read the Bias, Risks, and Limitations section: checkbox
---

# Person Name Match Likelihood (v6)

> **Author:** Elad Laor · [LinkedIn](https://www.linkedin.com/in/elad-laor-1b1383250/)
>
> A scoring head over [`MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`](https://huggingface.co/MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli) trained to predict whether two strings refer to the **same person**. Useful for record linkage, deduplication, and KYC-style identity matching where the only signal is a pair of name strings.

## Quick start

```python
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

BASE = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"
ADAPTER = "LessLM/person-name-match-likelihood-v6"

tokenizer = AutoTokenizer.from_pretrained(BASE)
# The base checkpoint ships a 3-class NLI head, so requesting a 2-class head
# needs ignore_mismatched_sizes=True; the adapter is expected to supply the
# trained classification head on top.
base = AutoModelForSequenceClassification.from_pretrained(
    BASE, num_labels=2, ignore_mismatched_sizes=True
)
model = PeftModel.from_pretrained(base, ADAPTER).eval()


def score(name_a: str, name_b: str) -> float:
    """Return P(same person) in [0, 1]."""
    inputs = tokenizer(name_a, name_b, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Index 1 is the `match` class in the [no_match, match] softmax.
    return torch.softmax(logits, dim=-1)[0, 1].item()


print(score("John A. Smith", "J. Smith"))        # ~0.95 (initial expansion)
print(score("Yitzhak Cohen", "Itzhak Cohen"))    # ~0.99 (transliteration)
print(score("John Smith", "John Smyth"))         # ~0.90 (typo)
print(score("Robert Adams", "Roberta Adams"))    # ~0.05 (similar but different)
```

The model returns a 2-way softmax over `[no_match, match]`. The `match` probability can be read as a likelihood score. For calibrated probabilities, this repo includes a temperature scaler (`calibration.pt`) fitted on a held-out set: load it and apply it to the logits before the softmax for a slightly lower Expected Calibration Error.
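
The effect of temperature scaling on the `match` probability can be sketched in plain Python. This is a minimal illustration, not the repo's loading code: it assumes the scaler reduces to a single scalar `T` that divides the logits (the usual definition of temperature scaling) and uses `T = 0.95`, the value reported under the reproducibility details. How `calibration.pt` is serialized is not documented here, so loading the file is left out, and `calibrated_match_probability` is a hypothetical helper name.

```python
import math


def calibrated_match_probability(logits, temperature=0.95):
    """Softmax over [no_match, match] logits after dividing by a temperature.

    Assumption: the scaler is a single scalar T applied as logits / T.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    return exps[1] / sum(exps)


# Illustrative logits, not real model output.
raw = calibrated_match_probability([-1.0, 2.0], temperature=1.0)
calibrated = calibrated_match_probability([-1.0, 2.0], temperature=0.95)
print(f"raw={raw:.4f} calibrated={calibrated:.4f}")
```

With a temperature below 1 the logits are stretched, so scores that were already confident move slightly further from 0.5; the ranking of pairs does not change.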

## Headline metrics

Evaluated on a held-out test set of **2,510 name pairs** drawn from real public entity data (OpenSanctions) and a curated synthetic edge-case set.

| Metric | Score |
|---|---|
| F1 | **0.9682** |
| Precision | 0.9568 |
| Recall | 0.9798 |
| Accuracy | 0.9733 |
| Expected Calibration Error | 0.0162 |
| Latency (p95, CPU) | **0.42 ms** |

### Performance by edge case

| Edge case | Accuracy | n |
|---|---|---|
| `nickname` (Bob ↔ Robert) | **100.0%** | 121 |
| `name_order` (Last, First ↔ First Last) | **100.0%** | 112 |
| `transliteration` (Yitzhak ↔ Itzhak) | **100.0%** | 93 |
| `initial` (J. Smith ↔ John Smith) | **100.0%** | 112 |
| `middle_name` add/drop | **100.0%** | 50 |
| `title_suffix` (Dr., Jr.) | **100.0%** | 112 |
| `hyphenation`, `case_variation`, `combined`, `unrelated` | **100.0%** | 86 |
| `tricky_non_match` (similar non-matches) | 97.9% | 331 |
| `partial_overlap` | 97.7% | 353 |
| `unknown` (real-world, no curated label) | 97.5% | 682 |
| `similar_name` (Robert ↔ Roberta) | 95.0% | 341 |
| `typo` | **84.6%** | 117 |

The model is strongest on canonical edge cases (nicknames, initials, transliteration) and weakest on character-level typos, which overlap with the `similar_name` distribution: a one-character difference can be either a misspelling of the same name or a different name.
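
As a sanity check on the table above, the per-category rows can be combined into a count-weighted accuracy. The sketch below copies the accuracies and counts from the table and assumes the `n` of 86 on the grouped row is the combined count for its four categories. The total should equal the 2,510 test pairs, and the weighted accuracy should land close to the headline 0.9733; small differences come from rounding in the per-row percentages.

```python
# (edge case, accuracy in percent, number of pairs), copied from the table above.
EDGE_CASES = [
    ("nickname", 100.0, 121),
    ("name_order", 100.0, 112),
    ("transliteration", 100.0, 93),
    ("initial", 100.0, 112),
    ("middle_name", 100.0, 50),
    ("title_suffix", 100.0, 112),
    # Assumption: 86 is the combined count for these four categories.
    ("hyphenation/case_variation/combined/unrelated", 100.0, 86),
    ("tricky_non_match", 97.9, 331),
    ("partial_overlap", 97.7, 353),
    ("unknown", 97.5, 682),
    ("similar_name", 95.0, 341),
    ("typo", 84.6, 117),
]


def weighted_accuracy(rows):
    """Return (count-weighted accuracy in [0, 1], total number of pairs)."""
    total = sum(n for _, _, n in rows)
    correct = sum(acc / 100.0 * n for _, acc, n in rows)
    return correct / total, total


overall, total_pairs = weighted_accuracy(EDGE_CASES)
print(f"pairs={total_pairs} weighted accuracy={overall:.4f}")
```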

## How it was trained

- **Base:** `MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli` (184M params, 12-layer encoder with disentangled attention).
- **Adapter:** LoRA (rank=16, alpha=32, dropout=0.1), targeting `query_proj`, `key_proj`, `value_proj` in every attention block. ~600K trainable parameters (~0.3% of base).
- **Loss:** Focal loss (γ=2.0), which down-weights easy examples so training concentrates on hard pairs (similar names, typos).
- **Optimizer:** AdamW, LR=2e-4, cosine schedule, 10% warmup, weight decay 0.01.
- **Schedule:** 10 epochs, batch size 32, max sequence length 128 tokens, BF16 mixed precision.
- **Seed:** 42.
- **Data:** ~174K balanced match/no-match pairs (after 2.5× augmentation). Half are drawn from OpenSanctions entities (real public records) and half from a synthetic generator covering the 15 edge cases in the table above. Train/validation/calibration splits are entity-level (no person appears in more than one split) and use a deterministic MD5-based hash, so the splits reproduce bit-for-bit.
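
The entity-level, MD5-based split described in the last bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the training script: the split fractions, the `test` bucket, the entity IDs, and the `assign_split` helper are all hypothetical. The point it shows is that hashing the entity ID (rather than the pair) sends every pair for a given person to the same split, and that MD5 gives the same answer on every run and machine.

```python
import hashlib

# Hypothetical split fractions; the card does not state the actual ratios.
SPLITS = (("train", 0.80), ("validation", 0.10), ("calibration", 0.05), ("test", 0.05))


def assign_split(entity_id: str, splits=SPLITS) -> str:
    """Map an entity ID to a split using a stable MD5 hash."""
    digest = hashlib.md5(entity_id.encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits as a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    cumulative = 0.0
    for name, fraction in splits:
        cumulative += fraction
        if bucket < cumulative:
            return name
    return splits[-1][0]  # guards against floating-point round-off


# Made-up entity IDs; two pairs belong to the same person ("Q1").
pairs = [
    ("Q1", "John A. Smith", "J. Smith"),
    ("Q1", "John Smith", "Smith, John"),
    ("Q2", "Yitzhak Cohen", "Itzhak Cohen"),
]
for entity_id, name_a, name_b in pairs:
    print(entity_id, assign_split(entity_id), name_a, "|", name_b)
```

Python's built-in `hash()` is randomized per process for strings, which is why a cryptographic digest such as MD5 is the usual choice when splits must be reproducible.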

## Bias, Risks, and Limitations

- **Latin script only.** The model was trained on Latin-script names. It will not work well on Hebrew, Arabic, Chinese, Cyrillic, or other non-Latin scripts unless they are first transliterated.
- **OpenSanctions skew.** The real-world half of the training data is drawn from a public sanctions/PEP entity database. Names in that distribution skew toward political, business, and criminal figures, with heavy representation of Russian, Ukrainian, Iranian, Chinese, and Latin American transliterations and a long tail of titles and honorifics. The model may behave differently on, say, US consumer-database names than on this distribution.
- **Pair-level only.** This is a *pairwise* matcher: given two name strings, it scores their likelihood of being the same person. It does not do blocking, clustering, or one-to-many matching. For dedup over a large list, pair it with a blocking layer (cheap pre-filter on first-letter, soundex, etc.) before invoking the model.
- **Names alone.** The model sees no surrounding context (DOB, email, address). Two real-world people with the same name will score as a match. Use this as one signal among several in a real identity-matching pipeline, not as the sole decision.
- **Typo accuracy is the weakest cell.** 84.6% on character-level typos. If your input is OCR output or hand-transcribed names, expect more errors in this category and consider a separate spell-correction step before scoring.
- **No production guarantees.** This is a research/portfolio artifact. Performance on your distribution may differ. Evaluate on a sample of your own data before relying on it.
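
The blocking layer mentioned under "Pair-level only" can be as simple as grouping names by a cheap key and scoring only pairs within a group. The sketch below is one possible pre-filter, not part of this repo: `blocking_key` and `candidate_pairs` are hypothetical helpers, and the key (first letter of the last token) is chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(name: str) -> str:
    """Hypothetical cheap key: first letter of the last whitespace-separated token."""
    tokens = name.replace(",", " ").split()
    return tokens[-1][0].lower() if tokens else ""


def candidate_pairs(names):
    """Yield only the pairs of names that share a blocking key."""
    blocks = defaultdict(list)
    for name in names:
        blocks[blocking_key(name)].append(name)
    for block in blocks.values():
        yield from combinations(block, 2)


names = ["John A. Smith", "J. Smith", "John Smyth", "Yitzhak Cohen", "Itzhak Cohen"]
pairs = list(candidate_pairs(names))
all_pairs = len(names) * (len(names) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

Each surviving pair would then be passed to the model's scoring function. A key built from the last token misses `Last, First` reorderings, so a real pipeline would typically take the union of several keys (for example first-token and last-token initials, or a phonetic code) to keep recall high.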

## Intended use

- Record linkage and dedup of person-name fields in datasets where you have only name strings to work with.
- KYC and identity-matching workflows as one feature among several.
- Benchmarking and research on encoder-based entity matching.

## Out-of-scope use

- Non-Latin scripts (Hebrew, Arabic, Chinese, etc.) without prior transliteration.
- Surveillance, social scoring, or any use that would single out individuals for adverse treatment based on a name-match score alone.
- High-stakes one-shot identity decisions (eligibility, denial, arrest, eviction): the model gives a likelihood, not a verdict.

## License

[MIT](https://opensource.org/license/mit/). You are free to use, modify, and redistribute, including commercially, provided you keep the attribution and license notice.

## Citation

If you use this model, a backlink to this repo or the author's profile is appreciated.

```bibtex
@misc{laor2026_person_name_match_v6,
  author = {Elad Laor},
  title = {Person Name Match Likelihood (v6) --- a LoRA adapter on DeBERTa-v3 for pairwise person-name matching},
  year = {2026},
  url = {https://huggingface.co/LessLM/person-name-match-likelihood-v6}
}
```

## Reproducibility & technical details

- **Framework versions:** `peft==0.18.1`, `transformers>=4.40`, `torch>=2.0`.
- **Training environment:** RunPod RTX 4090, ~3h wall-clock, BF16. Original v6 trained on RTX 3090 Ti (cross-GPU F1 delta: −0.0025).
- **Seed:** All randomness controlled by seed=42 (numpy, torch, transformers, dataloader generators). Re-running the training script with this seed and dataset version produces F1 within ±0.005 across BF16-capable GPUs.
- **Calibration:** `calibration.pt` is a single-parameter temperature scaler (T=0.95) fitted on a 1.5K held-out set. Apply it to logits before the final softmax to slightly reduce Expected Calibration Error from 0.016 to ~0.012.
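
For readers who want to measure calibration on their own data, the Expected Calibration Error quoted above can be computed along these lines. This is a sketch of the standard binned definition, not the evaluation script used for this card: the number of bins (10, equal width) is an assumption, and `expected_calibration_error` is a hypothetical helper name.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary predictions.

    probs  : predicted P(match) for each pair
    labels : 1 for a true match, 0 otherwise
    n_bins : number of equal-width confidence bins (an assumption; the card
             does not state the binning it used)
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        predicted = 1 if p >= 0.5 else 0
        # Confidence is the probability assigned to the predicted class.
        confidence = p if predicted == 1 else 1.0 - p
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, 1.0 if predicted == y else 0.0))
    ece = 0.0
    for bucket in bins:
        if bucket:
            avg_conf = sum(c for c, _ in bucket) / len(bucket)
            avg_acc = sum(a for _, a in bucket) / len(bucket)
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            ece += len(bucket) / len(probs) * abs(avg_acc - avg_conf)
    return ece


# Toy data: ten predictions at 0.9 confidence, nine of them correct.
toy_probs = [0.9] * 10
toy_labels = [1] * 9 + [0]
print(f"ECE on toy data: {expected_calibration_error(toy_probs, toy_labels):.4f}")
```

Running this once on raw probabilities and once on temperature-scaled probabilities from the same held-out pairs gives a before/after comparison of the kind reported in the calibration bullet.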