---
base_model: broadfield-dev/bert-small-ner-pii-tuned-12261022
library_name: transformers
tags:
- onnx
- transformers
- optimum
- onnxruntime
- token-classification
- int8
- quantized
- mobile
language: en
pipeline_tag: token-classification
---

# ONNX Export: broadfield-dev/bert-small-ner-pii-tuned-12261022

This is an ONNX export of [broadfield-dev/bert-small-ner-pii-tuned-12261022](https://huggingface.co/broadfield-dev/bert-small-ner-pii-tuned-12261022), quantized to INT8 and optimized for mobile (ARM64).

## Model Details

- **Base Model:** `broadfield-dev/bert-small-ner-pii-tuned-12261022`
- **Task:** `token-classification`
- **Opset Version:** `17`
- **Optimization:** `INT8 - Optimized for Mobile (ARM64)`

## Usage

### Installation

```bash
pip install onnxruntime tokenizers numpy
```

### Python Example

```python
from tokenizers import Tokenizer
import onnxruntime as ort
import numpy as np

# 1. Load the lightweight tokenizer (no Transformers dependency needed)
tokenizer = Tokenizer.from_pretrained("broadfield-dev/bert-small-ner-pii-tuned-12261022-onnx")

# 2. Load the ONNX model (assumes model.onnx is in the working directory)
session = ort.InferenceSession("model.onnx")

# 3. Preprocess (simple text encoding)
text = "Run inference on mobile!"
encoding = tokenizer.encode(text)

# Prepare inputs. Exact names vary by model (usually input_ids + attention_mask);
# session.get_inputs() lists the names this export expects.
inputs = {
    "input_ids": np.array([encoding.ids], dtype=np.int64),
    "attention_mask": np.array([encoding.attention_mask], dtype=np.int64),
}

# 4. Run inference
outputs = session.run(None, inputs)
print("Output logits shape:", outputs[0].shape)
```

## About this Export

This model was exported using [Optimum](https://huggingface.co/docs/optimum/index) and `onnxruntime`. It was quantized with the `INT8 - Optimized for Mobile (ARM64)` settings.
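
The Python example in the Usage section stops at the raw logits. For a token-classification model, a typical next step is to take the highest-scoring label per token. The sketch below shows one way to do that. The function name `decode_token_labels` and the label map `ID2LABEL` are hypothetical and used only for illustration; the real mapping is usually stored in the model's `config.json` under `id2label`, which this card does not list.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical label map for illustration only; the real one is normally
# found in the model's config.json under "id2label".
ID2LABEL: Dict[int, str] = {0: "O", 1: "B-EMAIL", 2: "I-EMAIL"}


def decode_token_labels(
    tokens: Sequence[str],
    logits: Sequence[Sequence[float]],
    id2label: Dict[int, str],
) -> List[Tuple[str, str]]:
    """Pair each token with the label whose logit is highest."""
    pairs = []
    for token, row in zip(tokens, logits):
        # Index of the largest logit in this token's row (argmax).
        best = max(range(len(row)), key=lambda i: row[i])
        # Fall back to a generic name if the id is missing from the map.
        pairs.append((token, id2label.get(best, f"LABEL_{best}")))
    return pairs


# Toy logits with shape (sequence_length, num_labels), standing in for
# real model output so the snippet runs without the ONNX file.
toy_tokens = ["mail", "bob", "@", "x"]
toy_logits = [
    [2.0, 0.1, 0.3],
    [0.2, 3.1, 0.4],
    [0.1, 0.5, 2.2],
    [0.3, 0.2, 1.9],
]
result = decode_token_labels(toy_tokens, toy_logits, ID2LABEL)
print(result)
```

With the session from the Python example, the same function could be called as `decode_token_labels(encoding.tokens, outputs[0][0], id2label)`, assuming the first output holds logits of shape `(batch, sequence_length, num_labels)`. The result includes special tokens such as `[CLS]` and `[SEP]`, which you may want to filter out.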