This is an expanded version of the dleemiller/WordLlamaDetect model: two WordLlama-based LID models are stacked to enhance performance.
- Supported languages: 148
```
Training data (740k samples)
          │
          ▼
┌─────────────────────────────────────┐
│         Phase 1: Base Models        │
│                                     │
│  ┌──────────────┐  ┌──────────────┐ │
│  │ LID Model 01 │  │ LID Model 02 │ │
│  └───────┬──────┘  └───────┬──────┘ │
└──────────┼─────────────────┼────────┘
           │   train each    │
           │  independently  │
           ▼                 ▼
     lid_models[0]     lid_models[1]
           │                 │
           └────────┬────────┘
                    │
                    ▼
  collect_preds() → X: (N, 2*148) = (N, 296)

    model1 logits         model2 logits
      (N, 148)     cat      (N, 148)
          └──────────┬──────────┘
                     ▼
                 (N, 296)
                     │
          Linear(296 → 148)   ← 296*148 = 43,808 params trained
                     │
                     ▼
          (N, 148) → CrossEntropy(y)
```
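The Phase-2 meta-classifier is small enough to sketch in full. Below is a minimal PyTorch illustration of the pipeline above; the base-model call signature (each model mapping a batch of inputs to `(N, 148)` logits) and the optimizer choice are assumptions for illustration, not this repository's actual training code.

```python
# Minimal sketch of the stacking head diagrammed above (assumed PyTorch).
# `lid_models` stands in for the two frozen 148-way base classifiers;
# only the Linear(296 -> 148) meta-classifier is trained.
import torch
import torch.nn as nn

NUM_LANGS = 148

@torch.no_grad()
def collect_preds(lid_models, texts):
    """Concatenate the logits of both frozen base models: (N, 296)."""
    logits = [m(texts) for m in lid_models]  # two tensors of shape (N, 148)
    return torch.cat(logits, dim=-1)         # (N, 2 * 148) = (N, 296)

meta = nn.Linear(2 * NUM_LANGS, NUM_LANGS, bias=False)  # 296 * 148 = 43,808 weights
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(meta.parameters(), lr=1e-3)  # assumed optimizer/LR

def train_step(lid_models, texts, y):
    X = collect_preds(lid_models, texts)  # (N, 296), no grads flow to base models
    loss = criterion(meta(X), y)          # (N, 148) logits vs. gold language labels
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Only the 43,808 weights of the linear head are updated; both base models stay frozen, which is why `collect_preds` runs under `torch.no_grad()`.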
Evaluation results on FLORES+:
| Pair | Num. Languages | Accuracy | F1 (Macro) | Metrics per Base Model |
|---|---|---|---|---|
| gemma3_27b + gemma_300m | 148 | 0.9307 | 0.9303 | gemma3_27b: Acc 0.9147, F1 0.9149<br>gemma_300m: Acc 0.9087, F1 0.9078 |
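For reference, the Accuracy and F1 Macro columns are standard multi-class metrics. A minimal scikit-learn sketch, assuming `y_true`/`y_pred` are integer language labels predicted on the FLORES+ evaluation split (the card does not include the actual evaluation script):

```python
# Sketch of the metric computation over the 148 language classes,
# assuming integer label arrays from the FLORES+ evaluation split.
from sklearn.metrics import accuracy_score, f1_score

def evaluate(y_true, y_pred):
    """Return accuracy and macro-averaged F1."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1_macro": f1_score(y_true, y_pred, average="macro"),
    }
```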