---
license: apache-2.0
language:
- en
- multilingual
pipeline_tag: token-classification
tags:
- gliner
- ner
- token-classification
- social-media
- username-extraction
- onnx
- int8
- quantized
- cpu
library_name: gliner
base_model: LumeData/HandleAtlas-166m
---
# HandleAtlas-166m-CPU
CPU-optimized ONNX INT8 variant of [LumeData/HandleAtlas-166m](https://huggingface.co/LumeData/HandleAtlas-166m).
It is roughly 4× smaller and 4–6× faster than the PyTorch float weights and is intended for CPU inference.
## What's in this repo
- `model.onnx` — fp32 ONNX export
- `model_quantized.onnx` — INT8 dynamic-quantized ONNX (load this for the fastest path)
- Tokenizer + GLiNER config files
## Usage (quantized + thread-tuned)
```python
import os

# Match physical (not logical) cores. 4–8 is a good default on laptops.
N_THREADS = 8
# Set the OpenMP thread count before importing torch so the setting is
# picked up when the runtime initializes.
os.environ["OMP_NUM_THREADS"] = str(N_THREADS)

import torch
from gliner import GLiNER

torch.set_num_threads(N_THREADS)

# Load the INT8 ONNX graph (onnxruntime must be installed).
model = GLiNER.from_pretrained(
    "LumeData/HandleAtlas-166m-CPU",
    load_onnx_model=True,
    onnx_model_file="model_quantized.onnx",
)

labels = ['instagram_username', 'snapchat_username', 'youtube_username', 'twitch_username', 'tiktok_username', 'discord_username', 'x_username', 'cashapp_username', 'onlyfans_username', 'tumblr_username', 'github_username', 'kofi_username', 'patreon_username', 'roblox_username', 'generic_username']
text = "Insta: foodgrammer | Snap: chefchef | DC: gamer420 | $cashtag"

# Print each extracted handle with its predicted platform label and score.
for ent in model.predict_entities(text, labels, threshold=0.5):
    print(f"{ent['text']!r} -> {ent['label']} ({ent['score']:.2f})")
```
To use the unquantized fp32 ONNX export instead (smaller accuracy delta, ~2× faster than PyTorch),
swap `onnx_model_file="model_quantized.onnx"` for `onnx_model_file="model.onnx"`.
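The speedups quoted in this card depend on hardware and thread settings, so it can be worth timing both ONNX files on your own machine. The following is a minimal sketch of such a check. The helper name `median_latency_ms` and its `repeats`/`warmup` parameters are illustrative and not part of GLiNER, and the stand-in `predict` function below only exists so the snippet runs without downloading the model.

```python
import time
from statistics import median


def median_latency_ms(predict, texts, repeats=5, warmup=1):
    """Return the median per-text latency in milliseconds over `repeats` passes.

    `predict` is any callable taking one string. `warmup` passes run first and
    are not timed, so one-off initialization cost is excluded.
    """
    for _ in range(warmup):
        for t in texts:
            predict(t)
    per_text_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        for t in texts:
            predict(t)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        per_text_ms.append(elapsed_ms / len(texts))
    return median(per_text_ms)


# Stand-in predictor (hypothetical) so the sketch is runnable on its own.
def fake_predict(text):
    return text.split()


texts = ["Insta: foodgrammer | Snap: chefchef", "DC: gamer420 | $cashtag"]
latency = median_latency_ms(fake_predict, texts, repeats=3)
print(f"median latency: {latency:.4f} ms per text")
```

With a loaded model, the callable would be something like `lambda t: model.predict_entities(t, labels, threshold=0.5)`, run once per ONNX file with the same `texts` and thread settings.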
## Recommended thresholds
- Default: `threshold=0.5`
- For `generic_username`, bump to `0.65` to reduce false positives.
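One way to apply these two thresholds together is to call `predict_entities` once at the default `0.5` and then drop low-scoring `generic_username` hits afterwards. The sketch below assumes the entity dictionaries carry `text`, `label`, and `score` keys, as in the usage example. The helper `filter_by_label_threshold` is hypothetical, and the sample scores are made up for illustration.

```python
DEFAULT_THRESHOLD = 0.5
# Labels not listed here fall back to DEFAULT_THRESHOLD.
LABEL_THRESHOLDS = {"generic_username": 0.65}


def filter_by_label_threshold(entities, label_thresholds=LABEL_THRESHOLDS,
                              default=DEFAULT_THRESHOLD):
    """Keep only entities whose score meets the cutoff for their label."""
    return [
        ent for ent in entities
        if ent["score"] >= label_thresholds.get(ent["label"], default)
    ]


# Stand-in for model.predict_entities(text, labels, threshold=0.5);
# these scores are invented, not real model output.
sample = [
    {"text": "foodgrammer", "label": "instagram_username", "score": 0.91},
    {"text": "gamer420", "label": "generic_username", "score": 0.58},
    {"text": "chefchef", "label": "generic_username", "score": 0.70},
]

kept = filter_by_label_threshold(sample)
print([ent["text"] for ent in kept])  # ['foodgrammer', 'chefchef']
```

Filtering after a single prediction pass avoids running the model twice with different `threshold` values.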
## Notes on quality
INT8 dynamic quantization typically costs <1 F1 point on this kind of task.
For applications that require the absolute best precision, use the float
variant [LumeData/HandleAtlas-166m](https://huggingface.co/LumeData/HandleAtlas-166m).
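To see how much the quantized model drifts on your own data, you can compare its predictions with those of the float model on the same texts. The sketch below computes a span-level F1 that treats the float model's output as the reference. It measures agreement between the two variants, not accuracy against gold labels. The function `span_f1` is hypothetical, and it assumes each entity dictionary has `start`, `end`, and `label` keys. The sample spans are invented.

```python
def span_f1(reference, candidate):
    """Span-level F1 of `candidate` against `reference`.

    Two entities match only if start, end, and label are all identical.
    """
    ref = {(e["start"], e["end"], e["label"]) for e in reference}
    cand = {(e["start"], e["end"], e["label"]) for e in candidate}
    if not ref and not cand:
        return 1.0  # both empty: perfect agreement
    true_positives = len(ref & cand)
    precision = true_positives / len(cand) if cand else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example: the two variants agree on one span and disagree on the
# label of the other.
float_preds = [
    {"start": 7, "end": 18, "label": "instagram_username"},
    {"start": 27, "end": 35, "label": "snapchat_username"},
]
int8_preds = [
    {"start": 7, "end": 18, "label": "instagram_username"},
    {"start": 27, "end": 35, "label": "generic_username"},
]

print(span_f1(float_preds, int8_preds))  # 0.5
```

Averaging this score over a representative sample of texts gives a rough sense of whether the INT8 file is close enough for your application.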
## Labels
- `instagram_username`
- `snapchat_username`
- `youtube_username`
- `twitch_username`
- `tiktok_username`
- `discord_username`
- `x_username`
- `cashapp_username`
- `onlyfans_username`
- `tumblr_username`
- `github_username`
- `kofi_username`
- `patreon_username`
- `roblox_username`
- `generic_username`