# TinyBERT Address Autofill
A compact field-type classifier for HTML form autofill, developed by Mozilla's Credentials Management team for Firefox. Given a string describing a single form field's attributes, it predicts one of 66 autofill field types (`given-name`, `family-name`, `email`, `postal-code`, `address-line1`, `cc-number`, etc.), or `other` when the field should not be filled.
The model is fine-tuned from `huawei-noah/TinyBERT_General_4L_312D` on a corpus of manually annotated shopping and address forms collected by Mozilla, and is intended to run client-side inside Firefox (or any Transformers.js host) as a replacement for, or augmentation of, the existing regex-based heuristic field detector.
## ONNX variants

All variants live under `onnx/` and are loadable through Transformers.js by passing the corresponding `dtype` argument.
| File | Precision | Size | Transformers.js `dtype` |
|---|---|---|---|
| `onnx/model.onnx` | fp32 | 57.6 MB | `fp32` |
| `onnx/model_fp16.onnx` | fp16 | 28.9 MB | `fp16` |
| `onnx/model_quantized.onnx` | int8 dynamic (default) | 14.6 MB | `q8` |
| `onnx/model_int8.onnx` | int8 dynamic | 14.6 MB | `int8` |
| `onnx/model_uint8.onnx` | uint8 dynamic | 14.6 MB | `uint8` |
| `onnx/model_q4.onnx` | 4-bit weight-only on MatMul | 42.3 MB | `q4` |
| `onnx/model_q4f16.onnx` | 4-bit on top of fp16 | 22.4 MB | `q4f16` |
| `onnx/model_bnb4.onnx` | bitsandbytes NF4 | 41.9 MB | `bnb4` |
## How to use

### Transformers.js (browser)
```js
import { pipeline } from "@huggingface/transformers";

const classifier = await pipeline(
  "text-classification",
  "vazish/tinybert-address-autofill",
  { dtype: "q8" } // try "fp16" for highest fidelity, "q4f16" for smallest
);

const out = await classifier(
  "a-c-postal-code billing zip code dwfrm billing address fields postal code"
);
// → [{ label: "postal-code", score: 0.99 }]
```
### Python (Optimum + ONNX Runtime)
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model = ORTModelForSequenceClassification.from_pretrained(
    "vazish/tinybert-address-autofill",
    file_name="onnx/model.onnx",  # or onnx/model_quantized.onnx, etc.
)
tokenizer = AutoTokenizer.from_pretrained("vazish/tinybert-address-autofill")
clf = pipeline("text-classification", model=model, tokenizer=tokenizer)

clf("email email mail **email")
# → [{"label": "email", "score": 0.99}]
```
## Input format
The model expects a single string per field, built by concatenating that field's HTML attributes after light normalisation:
- Concatenate (in order): `type` + `autocomplete` + `id` + `name` + `placeholder` + the field's computed `<label>` text.
- Split camelCase boundaries to whitespace (`firstName` → `first name`).
- Lowercase the whole thing.
- If the field declares an `autocomplete` attribute, prepend an `a-c-<value>` token (e.g. `a-c-postal-code`).
- Optionally include adjacent-field context: `bb`-prefixed tokens for the previous field on the same form and `aa`-prefixed tokens for the next. Including adjacent context improves accuracy by roughly 8 percentage points relative to the same model trained on isolated fields.
Example input for a "first name" field followed by a "last name" field:

```
first name first name enter first name aaa-c-family-name aalast aaname
```
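For illustration, here is a minimal Python sketch of this normalisation (adjacent-field `aa`/`bb` context is omitted, and the helper name and attribute handling are assumptions rather than the actual Firefox implementation):

```python
import re

def preprocess_field(attrs: dict[str, str]) -> str:
    # Hypothetical helper: concatenate the documented attributes in order.
    # "label" stands in for the field's computed <label> text.
    keys = ("type", "autocomplete", "id", "name", "placeholder", "label")
    text = " ".join(attrs[k] for k in keys if attrs.get(k))
    # Split camelCase boundaries to whitespace: "firstName" -> "first Name".
    text = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    # Lowercase the whole string.
    text = text.lower()
    # Prepend an a-c-<value> token when autocomplete is declared.
    if attrs.get("autocomplete"):
        text = f"a-c-{attrs['autocomplete']} {text}"
    return text

print(preprocess_field({"type": "text", "id": "firstName", "placeholder": "Enter first name"}))
# -> "text first name enter first name"
```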
## Training
| Setting | Value |
|---|---|
| Base model | `huawei-noah/TinyBERT_General_4L_312D` (4 layers, hidden 312, intermediate 1200, 12 heads, ~14M params, max sequence length 512) |
| Head | `BertForSequenceClassification`, 66 output classes |
| Training set | ~360 real shopping / checkout / address forms, 6,691 labelled fields |
| Validation / test | ~246 forms, 4,300 fields, split into validation and test sets |
| Regions covered | US, CA, GB, FR, DE, BR, ES, JP, AT, IN, IT, PL, AU, CH (supported); some additional regions also represented for evaluation |
| Optimizer / schedule | Hugging Face `Trainer` defaults, 50 epochs |
| Hardware | Apple M1 MacBook Pro, ~75 minutes wall time |
Each form field is annotated with `data-mozautofill-type="<type>"` set to the expected autofill class; fields that should not be filled receive no attribute and are mapped to `other`.
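As a sketch of how such annotated forms could be turned into training pairs (using BeautifulSoup; the actual Mozilla tooling is not published with this card, and the function below is hypothetical):

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def extract_training_pairs(html: str) -> list[tuple[dict[str, str], str]]:
    # Walk every fillable element in an annotated form and pair its raw
    # attributes with the expected autofill class.
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for field in soup.find_all(["input", "select", "textarea"]):
        # Fields without the annotation attribute map to "other".
        label = field.get("data-mozautofill-type", "other")
        attrs = {k: field.get(k, "") for k in
                 ("type", "autocomplete", "id", "name", "placeholder")}
        pairs.append((attrs, label))
    return pairs
```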
## Evaluation
Evaluated on the project's held-out test set (2,168 labelled fields drawn from real address / shopping forms) using ONNX Runtime on CPU.
- Total: strict exact-match accuracy.
- Close: counts predictions on closely related labels as correct (e.g. `street-address` predicted when ground truth is `address-line1`, or `tel` predicted when ground truth is `tel-national`).
- Blank: false-fill rate, the fraction of `other`-labelled fields the model predicted as a real autofill type. Lower is better; this metric matters most for user experience, because a high false-fill rate means filling search boxes, comments, and gift-card fields with personal data.
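The three metrics could be computed from prediction/label pairs along these lines (a sketch; `CLOSE_GROUPS` lists only the two example pairs named above, not the project's full mapping):

```python
# Only the two "close" label pairs named above; the real mapping may be larger.
CLOSE_GROUPS = [
    {"street-address", "address-line1"},
    {"tel", "tel-national"},
]

def evaluate(preds: list[str], golds: list[str]) -> dict[str, float]:
    n = len(golds)
    # Total: strict exact-match accuracy.
    total = sum(p == g for p, g in zip(preds, golds)) / n
    # Close: exact matches, plus predictions in the same close group as the gold label.
    close = sum(
        p == g or any(p in grp and g in grp for grp in CLOSE_GROUPS)
        for p, g in zip(preds, golds)
    ) / n
    # Blank: fraction of true "other" fields predicted as a fillable type.
    others = [p for p, g in zip(preds, golds) if g == "other"]
    blank = sum(p != "other" for p in others) / max(len(others), 1)
    return {"total": total, "close": close, "blank": blank}
```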
| Variant | Total | Close | Blank | Throughput (CPU) |
|---|---|---|---|---|
| fp32 | 89.62% | 91.51% | 2.40% | ~218/s |
| fp16 | 89.71% | 91.61% | 2.31% | ~132/s |
| bnb4 | 88.42% | 90.64% | 2.77% | ~214/s |
| q4 | 88.01% | 90.54% | 2.58% | ~209/s |
| q4f16 | 88.01% | 90.54% | 2.58% | ~95/s |
| uint8 | 87.27% | 89.53% | 3.27% | ~163/s |
| int8 / quantized | 84.82% | 87.73% | 1.94% | ~257/s |
For reference, the existing Firefox regex-based heuristic detector reaches roughly 85% total accuracy on comparable test sets.
Highlights:
- fp16 is statistically indistinguishable from fp32 across all metrics while halving the file size; it is the recommended high-fidelity variant. Latency on CPU is ~2× fp32 because most CPUs lack native fp16 ops, but the gap closes on hardware with fp16 support and on WebGPU.
- int8 / quantized has the lowest exact accuracy but the lowest false-fill rate of any variant (1.94%, below the fp32 baseline). It errs toward `other` when uncertain, which is the safer failure mode for an autofill UI. This is the recommended size-constrained default.
- The 4-bit variants (`q4`, `q4f16`, `bnb4`) cluster around 88% total accuracy, with `q4f16` the smallest at 22 MB.
## Limitations
- Trained primarily on the supported-region list above. Accuracy on regions absent from the training data drops by ~5–10 percentage points; adding region-specific samples to the training set typically recovers most of that gap.
- Underrepresented field types (`address-line3`, `additional-name`, `phonetic-*`, `tel-local-prefix`, etc.) have very few training examples and are sometimes confidently misclassified.
- Quantized variants disagree with fp32 on roughly 0.1% (`fp16`) to ~5% (`int8`) of inputs; the aggregate effect is reflected in the evaluation table above.
- The model assumes the team's preprocessing format (camelCase-split, lowercased, with optional `a-c-`/`bb`/`aa` markers; see "Input format" above). Feeding raw HTML attribute strings without this normalisation will degrade accuracy.
## Citation
This model is built on TinyBERT:
```bibtex
@inproceedings{jiao-etal-2020-tinybert,
  title     = {{TinyBERT}: Distilling {BERT} for Natural Language Understanding},
  author    = {Jiao, Xiaoqi and Yin, Yichun and Shang, Lifeng and Jiang, Xin
               and Chen, Xiao and Li, Linlin and Wang, Fang and Liu, Qun},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2020},
  year      = {2020},
  pages     = {4163--4174},
  url       = {https://aclanthology.org/2020.findings-emnlp.372}
}
```
If you use this checkpoint, please also cite the Mozilla autofill ML investigation that produced it (citation forthcoming).
## License
Apache 2.0.