---
language: en
license: apache-2.0
library_name: transformers
tags:
- gender-classification
- first-name
- tiny-models
- spiceechat
- causal-lm
pipeline_tag: text-classification
datasets:
- SpiceeChat/Genre-Classifier
base_model:
- SpiceeChat/Genre-Classifier-1-20M-BASE-BF16
---

<p align="center">
  <img src="https://huggingface.co/spaces/SpiceeChat/README/resolve/main/SpiceeChat_org_logo.png"
       alt="SpiceeChat"
       width="120">
</p>

<h1 align="center">FirstName Gender Classifier - 30M</h1>

<p align="center">
  <em>Lightweight, fast, and accurate, because guessing isn't a strategy.</em>
</p>

<p align="center">
  <a href="https://huggingface.co/SpiceeChat"><img src="https://img.shields.io/badge/SpiceeChat-🔥-orange" alt="SpiceeChat"></a>
  <a href="https://www.apache.org/licenses/LICENSE-2.0"><img src="https://img.shields.io/badge/License-Apache%202.0-yellow" alt="License"></a>
  <img src="https://img.shields.io/badge/Params-~20M-blue" alt="Params">
  <img src="https://img.shields.io/badge/Accuracy-84.7%25-green" alt="Accuracy">
</p>

---
## Overview

This model is a fine-tuned version of a custom 20M-parameter CausalLM architecture originally built by **PhysiQuanty**. It was trained on a combination of:

- **150,000** samples from the `SpiceeChat/Genre-Classifier` dataset
- **922** hand-curated examples to improve coverage and diversity

The result is a compact, production-ready classifier that predicts gender from a first name with **~85% accuracy** on the validation set.

---
## Quick Start

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True,  # custom architecture; the modeling code ships in this repo
)
tokenizer = AutoTokenizer.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True,
)

name = "Arjun"
inputs = tokenizer(name, return_tensors="pt")
pred, probs = model.predict_gender(inputs.input_ids)

gender = "M" if pred.item() == 1 else "F"  # label 1 maps to M, anything else to F
print(f"{name} → {gender} (confidence: {probs.max().item():.2f})")
```

**Expected output:**

```
Arjun → M (confidence: 0.98)
```
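
With validation accuracy around 85%, a caller may prefer to abstain on low-confidence names rather than always return a label. The sketch below is one way to do that. It assumes the two class probabilities can be read out as plain floats with index 1 meaning `M` (matching the `pred == 1` check above); `label_with_threshold` and the 0.75 cutoff are hypothetical and not part of the model's API.

```python
from typing import Sequence


def label_with_threshold(probs: Sequence[float], threshold: float = 0.75) -> str:
    """Return "M", "F", or "unknown" for a two-class probability vector.

    Assumes index 0 is F and index 1 is M (hypothetical layout, inferred
    from the `pred == 1 -> "M"` mapping in the Quick Start).
    """
    if len(probs) != 2:
        raise ValueError("expected exactly two class probabilities")
    best = max(range(2), key=lambda i: probs[i])
    if probs[best] < threshold:
        return "unknown"  # abstain instead of guessing
    return "M" if best == 1 else "F"


# With the model loaded, the call might look like this
# (assumes `probs` is a 1x2 tensor, which is not confirmed here):
#   label_with_threshold(probs.squeeze(0).tolist())
print(label_with_threshold([0.02, 0.98]))  # M
print(label_with_threshold([0.45, 0.55]))  # unknown
```
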

---

## Performance

| Metric | Value |
|---|---|
| Validation Accuracy | 84.74% |
| Macro F1 | 81.06% |
| Parameters | ~20M |
| Model Size | 129 MB |

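
Macro F1 sits below accuracy in the table, which is typical when one class is rarer than the other. The sketch below shows how the two metrics are computed and why they diverge. The labels are toy data made up for illustration, not the model's validation set, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, treating `cls` as the positive label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=("F", "M")):
    """Unweighted mean of per-class F1, so each class counts equally."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy example: 8 M names, 2 F names, one F name misclassified as M.
y_true = ["M"] * 8 + ["F"] * 2
y_pred = ["M"] * 8 + ["M", "F"]

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")  # accuracy: 0.900
print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")  # macro F1: 0.804
```
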

Training ran for 3 epochs with class weighting (F : M = 3:1) to offset the class imbalance in the training data. Loss fell from 0.41 to 0.34 over training, with stable convergence and no sign of overfitting.
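
To make the 3:1 class weighting concrete, here is a minimal pure-Python sketch of a class-weighted cross-entropy. It normalizes by the sum of the weights, which mirrors how PyTorch's `CrossEntropyLoss` handles `weight` with mean reduction. Whether the actual training script used that exact loss is an assumption, and `weighted_cross_entropy` is a hypothetical helper.

```python
import math

CLASS_WEIGHTS = {"F": 3.0, "M": 1.0}  # F : M = 3 : 1, as stated above


def weighted_cross_entropy(probs_true_class, labels, weights=CLASS_WEIGHTS):
    """Weighted mean of -log p(true class), normalized by total weight.

    probs_true_class[i] is the probability the model assigned to the
    correct label of example i.
    """
    total = sum(weights[y] * -math.log(p) for p, y in zip(probs_true_class, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


labels = ["F", "M"]
probs = [0.5, 0.9]  # the model is unsure about the F name, confident on the M name

weighted = weighted_cross_entropy(probs, labels)
unweighted = weighted_cross_entropy(probs, labels, weights={"F": 1.0, "M": 1.0})
print(f"weighted: {weighted:.3f}, unweighted: {unweighted:.3f}")
# weighted: 0.546, unweighted: 0.399
```

The uncertain F prediction contributes three times as much to the weighted loss, so errors on the up-weighted class are penalized more heavily.
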

---

## What Makes This Model Different

- **Handles global names**: from Wei (Chinese) to Haruto (Japanese) to Ama (Ghanaian)
- **Generalizes beyond dictionaries**: learns naming patterns rather than relying on lookup tables
- **Custom lightweight architecture**: small enough to run comfortably on CPU
- **Compatible with Hugging Face Transformers**: loads through `AutoModel` and `AutoTokenizer` with `trust_remote_code=True`

---
## Training Details

| Detail | Value |
|---|---|
| Base model | `SpiceeChat/Genre-Classifier-1-20M-BASE-BF16` |
| Training data | 150,000 dataset samples + 922 hand-curated examples |
| Optimizer | AdamW (LR = 2e-5) |
| Batch size | 64 (train) / 256 (eval) |
| Hardware | Tesla T4 (FP16) |

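
The figures in the table imply a rough training length. The arithmetic below is a sketch under stated assumptions: all 150,922 examples are used for training (the validation split is not specified here), there is no gradient accumulation, and the final partial batch is kept.

```python
import math

train_examples = 150_000 + 922  # assumes every example is used for training
batch_size = 64                 # train batch size from the table
epochs = 3                      # from the Performance section

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 2359 7077
```
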

---

## Notes

- The model ties `head.weight` to `tok_emb.weight`. A harmless `head.weight | MISSING` warning may appear on load; this is expected behavior.
- `trust_remote_code=True` is required because the architecture is custom. The modeling code is included in this repository and can be audited before you enable it.
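
Because `trust_remote_code=True` runs code from the repository, one common precaution is to pin the revision you reviewed, so later pushes cannot change what gets executed. The sketch below builds the keyword arguments for that; `revision` is a standard `from_pretrained` parameter, while `pinned_load_kwargs` and the placeholder revision are hypothetical.

```python
REPO_ID = "SpiceeChat/FirstName-Genre-Classifier-30M-SFT"


def pinned_load_kwargs(revision: str) -> dict:
    """Build from_pretrained kwargs that tie remote code to one reviewed revision."""
    if revision in ("", "main"):
        raise ValueError("pin a commit hash or tag, not a moving branch")
    return {"trust_remote_code": True, "revision": revision}


# Hypothetical usage; replace the placeholder with the commit you reviewed:
#   from transformers import AutoModel
#   model = AutoModel.from_pretrained(REPO_ID, **pinned_load_kwargs("<reviewed-commit>"))
```
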

---

## Try It Yourself

```bash
python -c "
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
name = input('Enter a first name: ')
inputs = tokenizer(name, return_tensors='pt')
pred, _ = model.predict_gender(inputs.input_ids)
print('M' if pred.item() == 1 else 'F')
"
```

---
## License

Released under the Apache 2.0 license. Use it, modify it, and ship it under the terms of that license.

---

<p align="center">
  <sub>Built with a lot of caffeine by SpiceeChat</sub>
</p>

> Built by **PhysiQuanty** (did most of the work) and **QuantaSparkLabs**.