---
language: en
license: apache-2.0
library_name: transformers
tags:
- gender-classification
- first-name
- tiny-models
- spiceechat
- causal-lm
pipeline_tag: text-classification
datasets:
- SpiceeChat/Genre-Classifier
base_model:
- SpiceeChat/Genre-Classifier-1-20M-BASE-BF16
---
Lightweight, fast, and accurate, because guessing isn't a strategy.

---

## Overview

This model is a fine-tuned version of a custom 20M-parameter CausalLM architecture, originally built by **PhysiQuanty**. It was trained on a combination of:

- **150,000** samples from the `SpiceeChat/Genre-Classifier` dataset
- **922** hand-curated examples to improve coverage and diversity

The result is a compact, production-ready classifier that predicts gender from a first name with **~85% validation accuracy** and no unnecessary overhead.

---

## Quick Start

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True  # required: custom architecture, modeling code ships in this repo
)
tokenizer = AutoTokenizer.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True
)

name = "Arjun"
inputs = tokenizer(name, return_tensors="pt")

# predict_gender is provided by the custom modeling code; it returns the
# predicted class and the class probabilities.
pred, probs = model.predict_gender(inputs.input_ids)

# Label convention: 1 = M, anything else = F.
gender = "M" if pred.item() == 1 else "F"
print(f"{name} → {gender} (confidence: {probs.max().item():.2f})")
```

**Expected output:**

```
Arjun → M (confidence: 0.98)
```

---

## Performance

| Metric | Value |
|---|---|
| Validation Accuracy | 84.74% |
| Macro F1 | 81.06% |
| Parameters | ~20M |
| Model Size | 129 MB |

Trained for 3 epochs with class weighting (F : M = 3:1) to handle the natural imbalance in the training data. Loss fell from 0.41 to 0.34 over training, with stable convergence and no sign of overfitting.

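To make the class weighting and the Macro F1 metric reported above concrete, here is a minimal pure-Python sketch on toy data. It is not the actual training code: the function names are illustrative, the probabilities are made up, and it assumes that "F : M = 3:1" means each F example carries three times the loss weight of an M example.

```python
import math

# Label convention from the Quick Start: 1 = M, 0 = F.
# Assumption: "F : M = 3:1" means the F loss weight is three times the M weight.
CLASS_WEIGHTS = {0: 3.0, 1: 1.0}
UNIFORM_WEIGHTS = {0: 1.0, 1: 1.0}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight."""
    total_loss = 0.0
    total_weight = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total_loss += -w * math.log(p[y])
        total_weight += w
    return total_loss / total_weight


def macro_f1(preds, labels, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(1 for p, y in zip(preds, labels) if p == c and y == c)
        fp = sum(1 for p, y in zip(preds, labels) if p == c and y != c)
        fn = sum(1 for p, y in zip(preds, labels) if p != c and y == c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy batch (made-up numbers): each tuple is (p_F, p_M).
probs = [(0.9, 0.1), (0.2, 0.8), (0.4, 0.6), (0.3, 0.7)]
labels = [0, 1, 0, 1]
preds = [int(p[1] > p[0]) for p in probs]  # argmax over the two classes

weighted = weighted_cross_entropy(probs, labels, CLASS_WEIGHTS)
unweighted = weighted_cross_entropy(probs, labels, UNIFORM_WEIGHTS)

print(f"weighted loss:   {weighted:.4f}")
print(f"unweighted loss: {unweighted:.4f}")
print(f"macro F1:        {macro_f1(preds, labels):.4f}")
```

In this toy batch the one misclassified F example is counted three times as heavily, so the weighted loss comes out higher than the unweighted one. Macro F1 averages the per-class F1 scores with equal weight, which is why it can sit below plain accuracy when one class is harder than the other.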

---

## What Makes This Model Different

- **Handles global names**: from Wei (Chinese) to Haruto (Japanese) to Ama (Ghanaian)
- **Generalizes beyond dictionaries**: learns naming patterns rather than relying on lookup tables
- **Custom lightweight architecture**: small enough to run comfortably on CPU
- **Fully compatible** with Hugging Face Transformers: loads like any standard model (with `trust_remote_code=True`)

---

## Training Details

| Detail | Value |
|---|---|
| Base model | `SpiceeChat/Genre-Classifier-1-20M-BASE-BF16` |
| Training data | 150,000 dataset samples + 922 hand-curated examples |
| Optimizer | AdamW (LR = 2e-5) |
| Batch size | 64 (train) / 256 (eval) |
| Hardware | Tesla T4 (FP16) |

---

## Notes

- The model ties `head.weight` to `tok_emb.weight`. A harmless `head.weight | MISSING` warning may appear on load; this is expected behavior.
- `trust_remote_code=True` is required because the architecture is custom. The modeling code is included in this repository and fully auditable.

---

## Try It Yourself

```bash
python -c "
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
name = input('Enter a first name: ')
inputs = tokenizer(name, return_tensors='pt')
pred, _ = model.predict_gender(inputs.input_ids)
print('M' if pred.item() == 1 else 'F')
"
```

---

## License

Released under the Apache 2.0 license. Use it, modify it, ship it, within the terms of the license.

---

Built with a lot of caffeine ☕ by SpiceeChat

> Built by **PhysiQuanty** (did the most work) and **QuantaSparkLabs**.