	---
	license: apache-2.0
	language: multilingual
	library_name: transformers.js
	pipeline_tag: text-classification
	base_model: huawei-noah/TinyBERT_General_4L_312D
	tags:
	- autofill
	- field-classification
	- bert
	- tinybert
	- onnx
	- transformers.js
	- browser
	---

	# TinyBERT Address Autofill

	A compact field-type classifier for HTML form autofill developed by the
	Credentials Management Team on Firefox. Given a string describing a single form
	field's attributes, it predicts one of 66 autofill field types (`given-name`,
	`family-name`, `email`, `postal-code`, `address-line1`, `cc-number`, etc.) or
	`other` when the field should not be filled.

	The model is fine-tuned from `huawei-noah/TinyBERT_General_4L_312D` on a
	corpus of manually annotated shopping and address forms collected by Mozilla, and is
	intended to run client-side inside Firefox (or any Transformers.js host) as
	a replacement or augmentation for the existing regex-based heuristic field
	detector.

	## ONNX variants

	All variants live under `onnx/` and are loadable through Transformers.js by
	passing the corresponding `dtype` argument.

	| File | Precision | Size | Transformers.js `dtype` |
	| --- | --- | ---: | --- |
	| `onnx/model.onnx` | fp32 | 57.6 MB | `fp32` |
	| `onnx/model_fp16.onnx` | fp16 | 28.9 MB | `fp16` |
	| `onnx/model_quantized.onnx` | int8 dynamic (default) | 14.6 MB | `q8` |
	| `onnx/model_int8.onnx` | int8 dynamic | 14.6 MB | `int8` |
	| `onnx/model_uint8.onnx` | uint8 dynamic | 14.6 MB | `uint8` |
	| `onnx/model_q4.onnx` | 4-bit weight-only on MatMul | 42.3 MB | `q4` |
	| `onnx/model_q4f16.onnx` | 4-bit on top of fp16 | 22.4 MB | `q4f16` |
	| `onnx/model_bnb4.onnx` | bitsandbytes NF4 | 41.9 MB | `bnb4` |

	## How to use

	### Transformers.js (browser)

	```js
	import { pipeline } from "@huggingface/transformers";

	const classifier = await pipeline(
	  "text-classification",
	  "vazish/tinybert-address-autofill",
	  { dtype: "q8" } // smallest file (14.6 MB); try "fp16" or "fp32" for highest fidelity
	);

	const out = await classifier(
	  "a-c-postal-code billing zip code dwfrm billing address fields postal code"
	);
	// → [{ label: "postal-code", score: 0.99 }]
	```

	### Python (Optimum + ONNX Runtime)

	```python
	from optimum.onnxruntime import ORTModelForSequenceClassification
	from transformers import AutoTokenizer, pipeline

	model = ORTModelForSequenceClassification.from_pretrained(
	    "vazish/tinybert-address-autofill",
	    file_name="onnx/model.onnx",  # or onnx/model_quantized.onnx, etc.
	)
	tokenizer = AutoTokenizer.from_pretrained("vazish/tinybert-address-autofill")
	clf = pipeline("text-classification", model=model, tokenizer=tokenizer)

	clf("email email mail **email")
	# → [{"label": "email", "score": 0.99}]
	```

	## Input format

	The model expects a single string per field, built by concatenating that
	field's HTML attributes after light normalisation:

	1. Concatenate (in order): `type` + `autocomplete` + `id` + `name` +
	`placeholder` + the field's computed `<label>` text.
	2. Split camelCase boundaries to whitespace (`firstName` → `first name`).
	3. Lowercase the whole thing.
	4. If the field declares an `autocomplete` attribute, prepend an
	`a-c-<value>` token (e.g. `a-c-postal-code`).
	5. Optionally include adjacent-field context — `bb`-prefixed tokens for
	the previous field on the same form and `aa`-prefixed tokens for the
	next. Including adjacent context improves accuracy by roughly 8 percentage
	points relative to the same model trained on isolated fields.

	Example input for a "first name" field followed by a "last name" field:

	```
	first name first name enter first name aaa-c-family-name aalast aaname
	```
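
The steps above can be sketched in Python as follows. This is a minimal illustration, not the team's actual preprocessing code, and it rests on several assumptions inferred from the examples in this card: the `autocomplete` value is emitted once, as the leading `a-c-<value>` token; separators such as `_`, `.` and `-` in `id` and `name` values become spaces; `bb` tokens are placed after the field's own tokens and before the `aa` tokens. The names `normalise`, `field_tokens` and `build_input` are hypothetical.

```python
import re

# Attribute order from step 1; "label" stands for the computed <label> text.
# The autocomplete value is handled separately (assumption, see field_tokens).
ATTRIBUTE_ORDER = ("type", "id", "name", "placeholder", "label")


def normalise(text):
    """Split camelCase boundaries, lowercase, and return whitespace tokens."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)  # firstName -> first Name
    spaced = re.sub(r"[^A-Za-z0-9]+", " ", spaced)  # assumption: separators -> spaces
    return spaced.lower().split()


def field_tokens(attrs):
    """Tokens for one field, given a dict of its attributes."""
    tokens = []
    autocomplete = attrs.get("autocomplete")
    if autocomplete:
        # Assumption: a single-token autocomplete value, kept intact with hyphens.
        tokens.append("a-c-" + autocomplete.strip().lower())
    for key in ATTRIBUTE_ORDER:
        tokens.extend(normalise(attrs.get(key) or ""))
    return tokens


def build_input(field, previous=None, following=None):
    """Model input string for `field`, with optional adjacent-field context."""
    tokens = field_tokens(field)
    if previous:
        tokens += ["bb" + t for t in field_tokens(previous)]
    if following:
        tokens += ["aa" + t for t in field_tokens(following)]
    return " ".join(tokens)


first = {"id": "firstName", "name": "firstName", "placeholder": "Enter first name"}
last = {"autocomplete": "family-name", "id": "lastName"}
print(build_input(first, following=last))
```

With these hypothetical attribute values, the call reproduces the example string shown above.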

	## Training

	| | |
	| --- | --- |
	| Base model | `huawei-noah/TinyBERT_General_4L_312D` (4 layers, hidden 312, intermediate 1200, 12 heads, ~14M params, max sequence length 512) |
	| Head | `BertForSequenceClassification`, 66 output classes |
	| Training set | ~360 real shopping / checkout / address forms, 6,691 labelled fields |
	| Validation / test | ~246 forms, 4,300 fields, split into validation and test |
	| Regions covered | US, CA, GB, FR, DE, BR, ES, JP, AT, IN, IT, PL, AU, CH (supported); some additional regions also represented for evaluation |
	| Optimizer / schedule | Hugging Face `Trainer` defaults, 50 epochs |
	| Hardware | Apple M1 MacBook Pro, ~75 minutes wall time |

	Each form field is annotated with `data-mozautofill-type="<type>"` set to
	the expected autofill class; fields that should not be filled receive no
	attribute and are mapped to `other`.
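
As an illustration of this annotation scheme, the sketch below reads labels out of an annotated form using only the Python standard library. It is not the project's data loader. The class name `LabelExtractor` and the choice of `input`, `select` and `textarea` as the tags that count as form fields are assumptions.

```python
from html.parser import HTMLParser


class LabelExtractor(HTMLParser):
    """Collect (field id, label) pairs from an annotated HTML form."""

    FIELD_TAGS = {"input", "select", "textarea"}  # assumption: what counts as a field

    def __init__(self):
        super().__init__()
        self.labels = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.FIELD_TAGS:
            return
        attr_map = dict(attrs)
        # Fields without the annotation attribute are mapped to "other".
        label = attr_map.get("data-mozautofill-type") or "other"
        self.labels.append((attr_map.get("id"), label))


html = (
    '<form>'
    '<input id="fn" data-mozautofill-type="given-name">'
    '<input id="q" type="search">'
    '<select id="c" data-mozautofill-type="country"></select>'
    '</form>'
)
extractor = LabelExtractor()
extractor.feed(html)
print(extractor.labels)
```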

	## Evaluation

	Evaluated on the project's held-out test set (2,168 labelled fields drawn
	from real address / shopping forms) using ONNX Runtime on CPU.

	- Total — strict exact-match accuracy.
	- Close — counts predictions on closely related labels as correct
	(e.g. `street-address` predicted when ground truth is `address-line1`,
	`tel` predicted when ground truth is `tel-national`).
	- Blank — false-fill rate. Fraction of `other`-labelled fields the
	model predicted as a real autofill type. Lower is better; this metric
	matters most for user experience because high false-fill means filling
	search boxes, comments, and gift-card fields with personal data.
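
The three metrics can be sketched as below. This is an illustrative implementation rather than the project's evaluation script: the `CLOSE_PAIRS` set contains only the two pairs named above, and treating closeness as symmetric is an assumption. The full list of closely related labels is not given in this card.

```python
# Pairs of labels treated as "close". Only the two examples named in this
# card are listed; symmetry (either label may be the prediction) is assumed.
CLOSE_PAIRS = {
    frozenset(("street-address", "address-line1")),
    frozenset(("tel", "tel-national")),
}


def evaluate(y_true, y_pred):
    """Return Total, Close and Blank as fractions in [0, 1]."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    exact = sum(t == p for t, p in pairs)
    close = sum(t == p or frozenset((t, p)) in CLOSE_PAIRS for t, p in pairs)
    # Blank: among fields labelled "other", how many were given a real type.
    other_preds = [p for t, p in pairs if t == "other"]
    false_fills = sum(p != "other" for p in other_preds)
    return {
        "total": exact / len(pairs),
        "close": close / len(pairs),
        "blank": false_fills / len(other_preds) if other_preds else 0.0,
    }


y_true = ["address-line1", "tel-national", "other", "other", "email"]
y_pred = ["street-address", "tel-national", "other", "email", "given-name"]
scores = evaluate(y_true, y_pred)
print(scores)
```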

	| Variant | Total | Close | Blank | Throughput (CPU) |
	| --- | ---: | ---: | ---: | ---: |
	| fp32 | 89.62% | 91.51% | 2.40% | ~218/s |
	| fp16 | 89.71% | 91.61% | 2.31% | ~132/s |
	| bnb4 | 88.42% | 90.64% | 2.77% | ~214/s |
	| q4 | 88.01% | 90.54% | 2.58% | ~209/s |
	| q4f16 | 88.01% | 90.54% | 2.58% | ~95/s |
	| uint8 | 87.27% | 89.53% | 3.27% | ~163/s |
	| int8 / quantized | 84.82% | 87.73% | 1.94% | ~257/s |

	For reference, the existing Firefox regex-based heuristic detector reaches
	roughly 85% total accuracy on comparable test sets.

	Highlights:

	- fp16 matches fp32 to within 0.1 percentage points on all three metrics
	while halving the file size. It is the recommended high-fidelity
	variant. CPU throughput is about 40% lower than fp32 (~132/s versus
	~218/s) because most CPUs lack native fp16 ops, but the gap closes on
	hardware with fp16 support and on WebGPU.
	- int8 / quantized has the lowest exact accuracy but **the lowest
	false-fill rate of any variant** (1.94%, below the fp32 baseline). It
	errs toward `other` when uncertain — the safer failure mode for an
	autofill UI. This is the recommended size-constrained default.
	- 4-bit variants (`q4`, `q4f16`, `bnb4`) cluster around 88% total accuracy.
	`q4f16` is the smallest of the three at 22.4 MB, which is still larger
	than the 14.6 MB 8-bit variants.
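
To make the trade-offs concrete, the sketch below picks a variant under a download-size budget using the figures from the two tables in this card. The helper `pick_variant` and its `prefer` parameter are hypothetical. The `q8` entry uses the "int8 / quantized" row of the evaluation table.

```python
# Size in MB from the ONNX variants table; total accuracy and false-fill
# ("blank") rate in percent from the evaluation table. Keys are the
# Transformers.js dtype values.
VARIANTS = {
    "fp32": {"size_mb": 57.6, "total": 89.62, "blank": 2.40},
    "fp16": {"size_mb": 28.9, "total": 89.71, "blank": 2.31},
    "q8": {"size_mb": 14.6, "total": 84.82, "blank": 1.94},
    "uint8": {"size_mb": 14.6, "total": 87.27, "blank": 3.27},
    "q4": {"size_mb": 42.3, "total": 88.01, "blank": 2.58},
    "q4f16": {"size_mb": 22.4, "total": 88.01, "blank": 2.58},
    "bnb4": {"size_mb": 41.9, "total": 88.42, "blank": 2.77},
}


def pick_variant(max_size_mb, prefer="total"):
    """Return the best dtype within the size budget, or None if none fits.

    prefer="total" maximises exact-match accuracy;
    prefer="blank" minimises the false-fill rate.
    """
    fits = {k: v for k, v in VARIANTS.items() if v["size_mb"] <= max_size_mb}
    if not fits:
        return None
    if prefer == "blank":
        return min(fits, key=lambda k: fits[k]["blank"])
    return max(fits, key=lambda k: fits[k]["total"])


print(pick_variant(30))                  # budget of 30 MB, ranked by accuracy
print(pick_variant(15, prefer="blank"))  # budget of 15 MB, ranked by false-fill
```

Ranking by false-fill rate rather than accuracy reflects the point above that erring toward `other` is the safer failure mode.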

	## Limitations

	- Trained primarily on the supported-region list above. Accuracy on
	regions with no training data drops by roughly 5–10 percentage points;
	adding region-specific samples to the training set typically recovers
	most of that gap.
	- Underrepresented field types (`address-line3`, `additional-name`,
	`phonetic-*`, `tel-local-prefix`, etc.) have very few training examples
	and are sometimes confidently misclassified.
	- Reduced-precision variants disagree with fp32 on roughly 0.1% (`fp16`)
	to ~5% (`int8`) of inputs. The evaluation table above shows the
	resulting accuracy differences.
	- The model assumes the team's preprocessing format (camelCase-split,
	lowercased, with optional `a-c-`/`bb`/`aa` markers). Feeding raw HTML
	attribute strings without this normalisation will degrade accuracy.
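
To check how often a reduced-precision variant disagrees with fp32 on your own inputs, a comparison along these lines is one option. This is a minimal sketch: the function name `disagreement_rate` is hypothetical, and the two label lists are assumed to come from running both variants on the same inputs in the same order.

```python
def disagreement_rate(reference, candidate):
    """Fraction of inputs where two variants predict different labels."""
    if len(reference) != len(candidate):
        raise ValueError("both prediction lists must cover the same inputs")
    if not reference:
        return 0.0
    differing = sum(r != c for r, c in zip(reference, candidate))
    return differing / len(reference)


# Illustrative labels only, not real model output.
fp32_labels = ["email", "tel", "other", "postal-code"]
int8_labels = ["email", "tel", "other", "other"]
print(disagreement_rate(fp32_labels, int8_labels))
```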

	## Citation

	This model is built on TinyBERT:

	```bibtex
	@inproceedings{jiao-etal-2020-tinybert,
	title = {{TinyBERT}: Distilling {BERT} for Natural Language Understanding},
	author = {Jiao, Xiaoqi and Yin, Yichun and Shang, Lifeng and Jiang, Xin
	and Chen, Xiao and Li, Linlin and Wang, Fang and Liu, Qun},
	booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2020},
	year = {2020},
	pages = {4163--4174},
	url = {https://aclanthology.org/2020.findings-emnlp.372}
	}
	```

	If you use this checkpoint, please also cite the Mozilla autofill ML
	investigation that produced it (citation forthcoming).

	## License

	Apache 2.0.