|
|
--- |
|
|
base_model: broadfield-dev/bert-small-ner-pii-tuned-12261022 |
|
|
library_name: transformers |
|
|
tags: |
|
|
- onnx |
|
|
- transformers |
|
|
- optimum |
|
|
- onnxruntime |
|
|
- token-classification |
|
|
- int8 |
|
|
- quantized |
|
|
- mobile |
|
|
language: en |
|
|
pipeline_tag: token-classification |
|
|
--- |
|
|
|
|
|
# ONNX Export: broadfield-dev/bert-small-ner-pii-tuned-12261022 |
|
|
|
|
|
This is a version of [broadfield-dev/bert-small-ner-pii-tuned-12261022](https://huggingface.co/broadfield-dev/bert-small-ner-pii-tuned-12261022) that has been converted to the ONNX format and quantized to INT8 for mobile (ARM64) deployment.
|
|
|
|
|
## Model Details |
|
|
- **Base Model:** `broadfield-dev/bert-small-ner-pii-tuned-12261022` |
|
|
- **Task:** `token-classification` |
|
|
- **Opset Version:** `17` |
|
|
- **Optimization:** `INT8 - Optimized for Mobile (ARM64)` |
|
|
|
|
|
## Usage |
|
|
|
|
|
### Installation |
|
|
```bash |
|
|
pip install onnxruntime tokenizers numpy
|
|
``` |
|
|
|
|
|
### Python Example |
|
|
```python |
|
|
from tokenizers import Tokenizer |
|
|
import onnxruntime as ort |
|
|
import numpy as np |
|
|
|
|
|
# 1. Load the lightweight tokenizer (No Transformers dependency needed) |
|
|
tokenizer = Tokenizer.from_pretrained("broadfield-dev/bert-small-ner-pii-tuned-12261022-onnx") |
|
|
|
|
|
# 2. Load the ONNX model |
|
|
session = ort.InferenceSession("model.onnx")  # path to the downloaded model.onnx file
|
|
|
|
|
# 3. Preprocess (Simple text encoding) |
|
|
text = "Run inference on mobile!" |
|
|
encoding = tokenizer.encode(text) |
|
|
|
|
|
# Prepare inputs (exact names vary by model; check session.get_inputs() — usually input_ids + attention_mask)
|
|
inputs = {
|
|
"input_ids": np.array([encoding.ids], dtype=np.int64), |
|
|
"attention_mask": np.array([encoding.attention_mask], dtype=np.int64) |
|
|
}
|
|
|
|
|
# 4. Run Inference |
|
|
outputs = session.run(None, inputs) |
|
|
print("Output logits shape:", outputs[0].shape) |
|
|
|
|
|
``` |
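The logits from step 4 can be decoded into per-token labels with a simple argmax. Below is a minimal sketch using NumPy; the `id2label` map and the logits here are hypothetical placeholders — the real label mapping lives in the model's `config.json`, and the real logits come from `outputs[0]` above:

```python
import numpy as np

# Hypothetical id -> tag map; the real one is the id2label entry in config.json.
id2label = {0: "O", 1: "B-PER", 2: "I-PER"}

# Dummy logits shaped (batch, seq_len, num_labels), standing in for outputs[0].
logits = np.array([[[2.0, 0.1, 0.0],
                    [0.1, 3.0, 0.2],
                    [0.0, 0.5, 2.5]]])

pred_ids = logits.argmax(axis=-1)[0]           # best label id for each token
labels = [id2label[int(i)] for i in pred_ids]  # map ids to tag strings
print(labels)  # → ['O', 'B-PER', 'I-PER']
```

Tokens tagged with special markers (e.g. `[CLS]`, `[SEP]`) and sub-word pieces typically need to be filtered or merged before presenting results.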
|
|
|
|
|
## About this Export |
|
|
This model was exported using [Optimum](https://huggingface.co/docs/optimum/index) and `onnxruntime`. |
|
|
It was quantized with the `INT8 - Optimized for Mobile (ARM64)` settings.
|
|
|