# fastText LID-176 → GGUF

GGUF build of Facebook's lid.176.bin, the fastText supervised LID classifier covering 176 languages with ISO 639-1 short codes (en, de, fr, ru, zh, ja, …). Designed to drop into the crispasr runtime alongside the GlotLID-V3 GGUF for post-ASR text language identification.
## ⚠️ License: CC-BY-SA-3.0 (viral)
The upstream lid.176.bin is distributed under Creative Commons Attribution-ShareAlike 3.0 (CC-BY-SA-3.0). Anyone redistributing this GGUF, or anything derived from it that includes its weights, must keep the same license terms:

- Attribute Facebook (the upstream authors).
- Distribute under CC-BY-SA-3.0 or a compatible license.
This is more restrictive than most ASR/LID models on the Hub. If that does not match your project's license posture, use the GlotLID-V3 GGUF (Apache-2.0) instead; it covers a strict superset of LID-176's languages.
## Files
| Quant | Size | Notes |
|---|---|---|
| F16 | 63 MB | only quant shipped; the input matrix is small (dim=16), so K-quants degrade quality more than they save space |
K-quants require 256-element row alignment; LID-176's dim=16 rows are below that, so K-quants would silently fall back to legacy Q4_0 / Q5_0 / Q8_0. The F16 build is already small enough that quantization is not worth the loss in precision; the embedding bag fits entirely in L1 cache regardless.
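As a sanity check, the size in the table follows directly from the hyperparameters stated elsewhere in this card (a back-of-envelope sketch, not an exact file-size computation):

```python
# Back-of-envelope size of the F16 build, using the hyperparameters
# stated in this card.
n_words, bucket, dim, n_labels = 40_010, 2_000_000, 16, 176

embedding_rows = n_words + bucket          # 2,040,010 embedding-bag rows
params = embedding_rows * dim + n_labels * dim
f16_bytes = params * 2                     # 2 bytes per f16 weight

print(f"{f16_bytes / 2**20:.1f} MiB")      # ~62 MiB of weights; the file
                                           # adds only small HS metadata
```

The embedding bag dominates: the output matrix and HS path tables together contribute only a few kilobytes.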
## Architecture: hierarchical softmax
Unlike GlotLID (flat softmax), LID-176 uses fastText's hierarchical
softmax (HS) loss with a deterministic Huffman tree built from
training-frequency label counts. The output matrix's n_labels rows
parameterize internal tree nodes; per-label log-probability is
computed by walking from the leaf to the root and summing
log_sigmoid((2*code - 1) * (output[node] Β· hidden)) along the path.
Tensors:

- `lid_fasttext.embedding.weight` [n_words + bucket, dim] = [2,040,010, 16]
- `lid_fasttext.output.weight` [n_labels, dim] = [176, 16]
- `lid_fasttext.hs_path_offsets` [n_labels + 1] i32: CSR-style path offsets per label
- `lid_fasttext.hs_paths` [total_steps] i32: flattened internal-node indices
- `lid_fasttext.hs_codes` [total_steps] i8: flattened code bits (0/1) per step
Hyperparams:

- n_words = 40,010
- bucket = 2,000,000
- dim = 16
- n_labels = 176
- minn, maxn = 2, 4
- loss = hs
- wordNgrams = 1
- HS tree: 1,855 total path steps, average path length 10.54
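The leaf-to-root walk described above can be sketched as follows (a minimal illustration assuming NumPy arrays laid out like the tensors listed; the helper name is mine, not part of crispasr):

```python
import numpy as np

def hs_label_logprob(hidden, output_w, offsets, paths, codes, label):
    # Sum log-sigmoid terms along the Huffman path for `label`,
    # mirroring the leaf-to-root walk described above. `offsets` is
    # the CSR index into the flattened `paths` / `codes` arrays.
    start, end = offsets[label], offsets[label + 1]
    logprob = 0.0
    for node, code in zip(paths[start:end], codes[start:end]):
        z = (2 * code - 1) * float(output_w[node] @ hidden)
        logprob += -np.logaddexp(0.0, -z)  # log sigmoid(z), stable form
    return logprob
```

Because each internal node splits probability mass between its two children, summing `exp(logprob)` over all labels yields exactly 1, so no explicit softmax normalization is needed.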
## Use with crispasr
Standalone text-LID CLI:

```sh
crispasr-lid -m fasttext-lid176-f16.gguf --text "Hallo Welt"
# de 0.999

crispasr-lid -m fasttext-lid176-f16.gguf -k 3 < transcript.txt
```
Post-ASR via the main crispasr binary:

```sh
crispasr -m ggml-base.bin -f input.wav \
    --lid-on-transcript fasttext-lid176-f16.gguf
# lang=en conf=0.934
```
## Validation
Per-stage diff harness against a Python ground-truth dump (uses
fasttext.predict() as the reference for HS top-1 labels):
```text
[PASS] input_ids         shape=[163] cos=1.000000 max_abs=0.00e+00
[PASS] embedding_bag_out shape=[16]  cos=1.000000 max_abs=3.55e-05
[PASS] logits            shape=[176] cos=1.000000 max_abs=1.48e-03
[PASS] softmax           shape=[176] cos=1.000000 max_abs=6.54e-05
[PASS] top1_score        shape=[1]   cos=1.000000
[INFO] top1_label ours='fr' ref='fr'
```
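A minimal version of such a per-stage check (tolerances and the helper name are my own choices, not the harness's) might look like:

```python
import numpy as np

def diff_stage(name, ours, ref, cos_tol=1 - 1e-6, abs_tol=5e-3):
    # Compare one pipeline stage against the Python ground truth,
    # reporting cosine similarity and max absolute difference in the
    # same style as the harness output above.
    ours = np.asarray(ours, dtype=np.float64).ravel()
    ref = np.asarray(ref, dtype=np.float64).ravel()
    cos = float(ours @ ref / (np.linalg.norm(ours) * np.linalg.norm(ref)))
    max_abs = float(np.max(np.abs(ours - ref)))
    ok = cos >= cos_tol and max_abs <= abs_tol
    print(f"[{'PASS' if ok else 'FAIL'}] {name} cos={cos:.6f} max_abs={max_abs:.2e}")
    return ok
```

Checking every intermediate tensor (ids, pooled embedding, logits, softmax) rather than only the final label makes it much easier to localize a porting bug to a single stage.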
8/8 multilingual smoke tests correctly identified (en/de/fr/ru/zh/ja/pt; mr/hi can swap on 2-word Devanagari inputs, which is upstream model behaviour, not a port artefact).
## Conversion
Built via crispasr's models/convert-glotlid-to-gguf.py (the same converter handles both flat-softmax GlotLID and HS LID-176; a variant flag selects which):
```sh
python models/convert-glotlid-to-gguf.py \
    --variant fasttext-lid176 \
    --input /path/to/lid.176.bin \
    --output fasttext-lid176-f16.gguf \
    --dtype f16
```
The converter parses lid.176.bin directly to extract per-label
training frequencies, deterministically rebuilds fastText's Huffman
tree (port of Model::buildTree from the upstream source), and emits
the path/code metadata alongside the weights.
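For illustration, a generic heap-based Huffman construction over label counts looks like this (a sketch only; the real converter ports fastText's deterministic Model::buildTree, which merges via two sorted pointers rather than a heap, so exact node numbering can differ):

```python
import heapq

def build_huffman_paths(counts):
    # Build a Huffman tree over per-label counts and return, for each
    # label, its internal-node path and 0/1 code bits (leaf upward),
    # matching the hs_paths / hs_codes layout described above.
    n = len(counts)
    heap = [(c, i) for i, c in enumerate(counts)]
    heapq.heapify(heap)
    parent, bit, next_id = {}, {}, n
    while len(heap) > 1:
        c1, a = heapq.heappop(heap)   # two least-frequent subtrees
        c2, b = heapq.heappop(heap)
        parent[a], bit[a] = next_id, 0
        parent[b], bit[b] = next_id, 1
        heapq.heappush(heap, (c1 + c2, next_id))
        next_id += 1
    root = heap[0][1]
    paths, codes = [], []
    for leaf in range(n):
        p, c, node = [], [], leaf
        while node != root:
            c.append(bit[node])
            node = parent[node]
            p.append(node - n)        # internal nodes re-indexed from 0
        paths.append(p)
        codes.append(c)
    return paths, codes
```

The resulting path lengths satisfy the Kraft equality (the sum of 2^-len over all labels is 1), which is exactly what makes the per-label sigmoid products a proper probability distribution.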
## Citation & license
Upstream:
- Joulin, A., Grave, E., Bojanowski, P., Mikolov, T. Bag of Tricks for Efficient Text Classification. EACL 2017.
- Joulin, A., Grave, E., Bojanowski, P., Douze, M., JΓ©gou, H., Mikolov, T. FastText.zip: Compressing text classification models.
This GGUF rebuild inherits the upstream CC-BY-SA-3.0 license. Redistributors must preserve the SA terms.