---
license: apache-2.0
language:
- en
- multilingual
pipeline_tag: token-classification
tags:
- gliner
- ner
- token-classification
- social-media
- username-extraction
- onnx
- int8
- quantized
- cpu
library_name: gliner
base_model: LumeData/HandleAtlas-166m
---

# HandleAtlas-166m-CPU

CPU-optimized ONNX INT8 variant of [LumeData/HandleAtlas-166m](https://huggingface.co/LumeData/HandleAtlas-166m).
~4× smaller and 4–6× faster than the PyTorch float weights, intended for CPU inference.

## What's in this repo

- `model.onnx`: fp32 ONNX export
- `model_quantized.onnx`: INT8 dynamic-quantized ONNX (load this for the fastest path)
- Tokenizer and GLiNER config files

## Usage (quantized + thread-tuned)

```python
import os

# Match physical (not logical) cores. 4–8 is a good default on laptops.
N_THREADS = 8
# Set the variable before importing torch/onnxruntime so native thread pools
# can read it when they initialize.
os.environ["OMP_NUM_THREADS"] = str(N_THREADS)

import torch
import onnxruntime as ort  # imported here to fail fast if onnxruntime is missing
from gliner import GLiNER

torch.set_num_threads(N_THREADS)

model = GLiNER.from_pretrained(
    "LumeData/HandleAtlas-166m-CPU",
    load_onnx_model=True,
    onnx_model_file="model_quantized.onnx",  # INT8 graph from this repo
)
labels = [
    "instagram_username", "snapchat_username", "youtube_username",
    "twitch_username", "tiktok_username", "discord_username",
    "x_username", "cashapp_username", "onlyfans_username",
    "tumblr_username", "github_username", "kofi_username",
    "patreon_username", "roblox_username", "generic_username",
]
text = "Insta: foodgrammer | Snap: chefchef | DC: gamer420 | $cashtag"
# Print each extracted handle with its predicted label and score.
for ent in model.predict_entities(text, labels, threshold=0.5):
    print(f"{ent['text']!r} -> {ent['label']} ({ent['score']:.2f})")
```

To use the unquantized ONNX export instead (smaller accuracy delta, ~2× faster than PyTorch),
swap `onnx_model_file="model_quantized.onnx"` for `onnx_model_file="model.onnx"`.
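
The usage snippet hard-codes a thread count. As a rough sketch of how one might derive it instead, the helper below estimates physical cores from the logical count and caps the result at 8, the upper end of the laptop default mentioned in the code comment. The function name `suggest_threads` and the assumption of two hardware threads per physical core are illustrative, not part of this repo; on machines without SMT, or with a different SMT width, pass `smt_factor` accordingly.

```python
import os


def suggest_threads(logical_cores=None, smt_factor=2, cap=8):
    """Estimate a thread count from the logical core count.

    Assumption: each physical core exposes `smt_factor` logical cores.
    This is a heuristic, not a measurement of the actual topology.
    """
    logical = logical_cores or os.cpu_count() or 1
    physical = max(1, logical // smt_factor)
    return min(physical, cap)


# Value that could be assigned to N_THREADS in the usage snippet.
n_threads = suggest_threads()
```

Benchmarking a few values around the suggestion on the target machine is still the more reliable way to pick a setting.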

## Recommended thresholds

- Default: `threshold=0.5`
- For `generic_username`, raise the threshold to `0.65` to reduce false positives.
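
The usage example passes a single `threshold` to `predict_entities`, so one way to apply a stricter cutoff to `generic_username` only is to predict at the default `0.5` and filter afterwards. The sketch below assumes the entity dictionaries carry the `label` and `score` keys used in the usage example; the helper name `filter_by_label_threshold` and the sample predictions are hypothetical.

```python
DEFAULT_THRESHOLD = 0.5
# Per-label overrides; labels not listed fall back to DEFAULT_THRESHOLD.
LABEL_THRESHOLDS = {"generic_username": 0.65}


def filter_by_label_threshold(entities, label_thresholds=None, default=DEFAULT_THRESHOLD):
    """Keep entities whose score meets the threshold for their label."""
    thresholds = LABEL_THRESHOLDS if label_thresholds is None else label_thresholds
    return [
        ent for ent in entities
        if ent["score"] >= thresholds.get(ent["label"], default)
    ]


# Made-up predictions, shaped like the output of model.predict_entities(...).
sample = [
    {"text": "foodgrammer", "label": "instagram_username", "score": 0.91},
    {"text": "gamer420", "label": "generic_username", "score": 0.58},
    {"text": "chefchef", "label": "snapchat_username", "score": 0.47},
]
kept = filter_by_label_threshold(sample)
```

With a real model, `sample` would be replaced by `model.predict_entities(text, labels, threshold=0.5)`, so that every candidate at or above the lowest threshold reaches the filter.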

## Notes on quality

INT8 dynamic quantization typically costs <1 F1 point on this kind of task.
For applications that require the absolute best precision, use the float
variant [LumeData/HandleAtlas-166m](https://huggingface.co/LumeData/HandleAtlas-166m).
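
To check how much the quantized model diverges from the float variant on your own data, one option is to run both on the same texts and compare their predictions. The sketch below computes an F1-style agreement score over `(text, label)` pairs. It measures agreement between two prediction lists, not accuracy against gold annotations, and the helper name `prediction_agreement` is hypothetical.

```python
def prediction_agreement(reference, candidate):
    """F1-style overlap between two prediction lists, keyed on (text, label)."""
    ref = {(ent["text"], ent["label"]) for ent in reference}
    cand = {(ent["text"], ent["label"]) for ent in candidate}
    if not ref and not cand:
        return 1.0  # both models found nothing: full agreement
    overlap = len(ref & cand)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up predictions standing in for the float and INT8 outputs.
float_preds = [
    {"text": "foodgrammer", "label": "instagram_username"},
    {"text": "chefchef", "label": "snapchat_username"},
]
int8_preds = [
    {"text": "foodgrammer", "label": "instagram_username"},
]
agreement = prediction_agreement(float_preds, int8_preds)
```

Averaging this score over a sample of representative texts gives a quick signal of whether the INT8 variant is acceptable for a given workload.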

## Labels

- `instagram_username`
- `snapchat_username`
- `youtube_username`
- `twitch_username`
- `tiktok_username`
- `discord_username`
- `x_username`
- `cashapp_username`
- `onlyfans_username`
- `tumblr_username`
- `github_username`
- `kofi_username`
- `patreon_username`
- `roblox_username`
- `generic_username`
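
Because every label follows the `<platform>_username` pattern, predictions can be grouped by platform with a small post-processing step. This is a sketch only: the helper name `group_by_platform` and the sample predictions are hypothetical, and it assumes the `text` and `label` keys shown in the usage example.

```python
from collections import defaultdict


def group_by_platform(entities, suffix="_username"):
    """Map platform name -> list of extracted handles, in prediction order."""
    grouped = defaultdict(list)
    for ent in entities:
        label = ent["label"]
        # Strip the shared suffix; leave unexpected labels unchanged.
        platform = label[: -len(suffix)] if label.endswith(suffix) else label
        grouped[platform].append(ent["text"])
    return dict(grouped)


# Made-up predictions, shaped like the output of model.predict_entities(...).
sample_entities = [
    {"text": "foodgrammer", "label": "instagram_username"},
    {"text": "chefchef", "label": "snapchat_username"},
    {"text": "foodgrammer2", "label": "instagram_username"},
]
handles = group_by_platform(sample_entities)
```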