	---
	license: apache-2.0
	language:
	- en
	- multilingual
	pipeline_tag: token-classification
	tags:
	- gliner
	- ner
	- token-classification
	- social-media
	- username-extraction
	- onnx
	- int8
	- quantized
	- cpu
	library_name: gliner
	base_model: LumeData/HandleAtlas-166m
	---

	# HandleAtlas-166m-CPU

	CPU-optimized ONNX INT8 variant of [LumeData/HandleAtlas-166m](https://huggingface.co/LumeData/HandleAtlas-166m).
	The quantized model is ~4× smaller and 4–6× faster than the PyTorch float weights and is intended for CPU inference.

	## What's in this repo

	- `model.onnx` — fp32 ONNX export
	- `model_quantized.onnx` — INT8 dynamic-quantized ONNX (load this for the fastest path)
	- Tokenizer + GLiNER config files

	## Usage (quantized + thread-tuned)

	```python
	import os

	# Match physical (not logical) cores. 4–8 is a good default on laptops.
	N_THREADS = 8
	# Set this before importing torch/onnxruntime so their OpenMP runtime sees it.
	os.environ["OMP_NUM_THREADS"] = str(N_THREADS)

	import torch
	import onnxruntime as ort
	from gliner import GLiNER

	torch.set_num_threads(N_THREADS)

	model = GLiNER.from_pretrained(
	    "LumeData/HandleAtlas-166m-CPU",
	    load_onnx_model=True,
	    onnx_model_file="model_quantized.onnx",
	)

	labels = ['instagram_username', 'snapchat_username', 'youtube_username', 'twitch_username', 'tiktok_username', 'discord_username', 'x_username', 'cashapp_username', 'onlyfans_username', 'tumblr_username', 'github_username', 'kofi_username', 'patreon_username', 'roblox_username', 'generic_username']

	text = "Insta: foodgrammer | Snap: chefchef | DC: gamer420 | $cashtag"
	for ent in model.predict_entities(text, labels, threshold=0.5):
	    print(f"{ent['text']!r} -> {ent['label']} ({ent['score']:.2f})")
	```

	To use the unquantized fp32 ONNX model instead (smaller accuracy delta, ~2× faster than PyTorch),
	swap `onnx_model_file="model_quantized.onnx"` for `onnx_model_file="model.onnx"`.
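
The usage snippet asks for `N_THREADS` to match physical rather than logical cores. A minimal sketch of one way to pick that value, assuming the optional `psutil` package is available for the physical count and otherwise falling back to half of `os.cpu_count()` (a rough heuristic that assumes two logical cores per physical core). The helper name `pick_threads` and the cap of 8 are illustrative and not part of this repo:

```python
import os


def pick_threads(cap: int = 8) -> int:
    """Return a thread count based on physical cores, capped at `cap`."""
    try:
        import psutil  # optional third-party dependency (assumption)
        physical = psutil.cpu_count(logical=False)  # may return None
    except ImportError:
        physical = None
    if not physical:
        # Fallback heuristic: assume 2 logical cores per physical core.
        physical = max(1, (os.cpu_count() or 1) // 2)
    return max(1, min(cap, physical))


N_THREADS = pick_threads()
# As in the usage snippet, set this before importing torch/onnxruntime.
os.environ["OMP_NUM_THREADS"] = str(N_THREADS)
```

The result can replace the hard-coded `N_THREADS = 8`; it is worth benchmarking a couple of values on the target machine, since the best setting depends on the hardware.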

	## Recommended thresholds

	- Default: `threshold=0.5`
	- For `generic_username`, bump to `0.65` to reduce false positives.
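
`predict_entities` takes a single `threshold`, so one way to apply a stricter cutoff to `generic_username` only is to predict at the default and filter afterwards. A minimal sketch, assuming entities are dicts with `text`, `label`, and `score` keys as in the usage example; `filter_entities` and `LABEL_THRESHOLDS` are hypothetical names, and the sample scores are made up for illustration:

```python
# Per-label overrides; values taken from the recommendations above.
DEFAULT_THRESHOLD = 0.5
LABEL_THRESHOLDS = {"generic_username": 0.65}


def filter_entities(entities, default=DEFAULT_THRESHOLD, overrides=LABEL_THRESHOLDS):
    """Keep entities whose score meets the threshold for their label."""
    return [
        ent for ent in entities
        if ent["score"] >= overrides.get(ent["label"], default)
    ]


# Stand-in for model.predict_entities(text, labels, threshold=0.5) output.
sample = [
    {"text": "foodgrammer", "label": "instagram_username", "score": 0.91},
    {"text": "someone", "label": "generic_username", "score": 0.58},
    {"text": "gamer420", "label": "discord_username", "score": 0.52},
]
kept = filter_entities(sample)
print([ent["text"] for ent in kept])  # ['foodgrammer', 'gamer420']
```

The `generic_username` hit at 0.58 is dropped because it falls below 0.65, while the other labels keep the 0.5 cutoff.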

	## Notes on quality

	INT8 dynamic quantization typically costs <1 F1 point on this kind of task.
	For applications that require the highest possible accuracy, use the float
	variant [LumeData/HandleAtlas-166m](https://huggingface.co/LumeData/HandleAtlas-166m).
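
To check the quantization cost on your own data, you can compare entity-level F1 for the float and INT8 models against a small labelled set. The sketch below is one simple way to do that, not the evaluation used for this model: `micro_f1` is a hypothetical helper, entities are matched exactly on `(doc_id, text, label)`, and the gold and predicted sets are made-up placeholders rather than real model output:

```python
def micro_f1(gold, pred):
    """Entity-level micro F1 over sets of (doc_id, text, label) tuples."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)  # exact matches only
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


# Placeholder annotations and predictions, for illustration only.
gold = {
    (0, "foodgrammer", "instagram_username"),
    (0, "chefchef", "snapchat_username"),
    (1, "gamer420", "discord_username"),
}
float_pred = set(gold)
int8_pred = {
    (0, "foodgrammer", "instagram_username"),
    (0, "chefchef", "snapchat_username"),
    (1, "gamer420", "generic_username"),  # label mismatch counts as a miss
}

delta = (micro_f1(gold, float_pred) - micro_f1(gold, int8_pred)) * 100
print(f"F1 delta: {delta:.1f} points")
```

In practice, `float_pred` and `int8_pred` would be built from `predict_entities` output of the two models on the same texts, and the delta only becomes meaningful with a reasonably large evaluation set.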

	## Labels

	- `instagram_username`
	- `snapchat_username`
	- `youtube_username`
	- `twitch_username`
	- `tiktok_username`
	- `discord_username`
	- `x_username`
	- `cashapp_username`
	- `onlyfans_username`
	- `tumblr_username`
	- `github_username`
	- `kofi_username`
	- `patreon_username`
	- `roblox_username`
	- `generic_username`