---
language: en
license: apache-2.0
library_name: transformers
tags:
- gender-classification
- first-name
- tiny-models
- spiceechat
- causal-lm
pipeline_tag: text-classification
datasets:
- SpiceeChat/Genre-Classifier
base_model:
- SpiceeChat/Genre-Classifier-1-20M-BASE-BF16
---
<p align="center">
<img src="https://huggingface.co/spaces/SpiceeChat/README/resolve/main/SpiceeChat_org_logo.png"
alt="SpiceeChat"
width="120">
</p>
<h1 align="center">FirstName Gender Classifier - 30M</h1>
<p align="center">
<em>Lightweight, fast, and accurate, because guessing isn't a strategy.</em>
</p>
<p align="center">
<a href="https://huggingface.co/SpiceeChat"><img src="https://img.shields.io/badge/SpiceeChat-🔥-orange" alt="SpiceeChat"></a>
<a href="https://www.apache.org/licenses/LICENSE-2.0"><img src="https://img.shields.io/badge/License-Apache%202.0-yellow" alt="License"></a>
<img src="https://img.shields.io/badge/Params-~20M-blue" alt="Params">
<img src="https://img.shields.io/badge/Accuracy-84.7%25-green" alt="Accuracy">
</p>
---
## Overview
This model is a fine-tuned version of a custom 20M-parameter CausalLM architecture, originally built by **PhysiQuanty**. It was trained on a combination of:
- **150,000** samples from the `SpiceeChat/Genre-Classifier` dataset
- **922** hand-curated examples to improve coverage and diversity
The result is a compact classifier that predicts gender from a first name with **~85% validation accuracy** (84.74%) and is small enough to run on CPU.
---
## Quick Start
```python
from transformers import AutoModel, AutoTokenizer

# The architecture is custom, so the modeling code shipped in this repo must be loaded.
model = AutoModel.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True,  # custom architecture; the code is included in this repo
)
tokenizer = AutoTokenizer.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True,
)

name = "Arjun"
inputs = tokenizer(name, return_tensors="pt")

# predict_gender returns the predicted class (1 = M, otherwise F) and the class probabilities.
pred, probs = model.predict_gender(inputs.input_ids)
gender = "M" if pred.item() == 1 else "F"
print(f"{name} → {gender} (confidence: {probs.max().item():.2f})")
```
**Expected output:**
```
Arjun → M (confidence: 0.98)
```
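
Since validation accuracy is around 85%, downstream code may prefer to abstain on low-confidence names rather than always emit a label. Below is a minimal sketch of that idea. It assumes the Quick Start convention (class 1 = M, otherwise F, with `probs.max()` as the confidence). The `label_with_threshold` helper and the 0.75 threshold are hypothetical illustrations, not part of the model's API. With the Quick Start outputs it would be called as `label_with_threshold(pred.item(), probs.max().item())`.

```python
def label_with_threshold(pred: int, confidence: float, threshold: float = 0.75) -> str:
    """Map a class index to "M"/"F", or "unknown" when confidence is too low.

    Hypothetical helper: `pred` follows the Quick Start convention
    (1 = M, anything else = F); `threshold` is an illustrative value
    that should be tuned on your own data.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a probability in [0, 1]")
    if confidence < threshold:
        return "unknown"
    return "M" if pred == 1 else "F"


print(label_with_threshold(1, 0.98))  # M
print(label_with_threshold(0, 0.55))  # unknown
```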
---
## Performance
| Metric | Value |
|---|---|
| Validation Accuracy | 84.74% |
| Macro F1 | 81.06% |
| Parameters | ~20M |
| Model Size | 129 MB |
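
Macro F1 averages the per-class F1 scores with equal weight, so errors on the smaller class pull it down more than they pull down accuracy. That is consistent with macro F1 sitting below accuracy in the table. The sketch below shows the computation on made-up labels (toy data, not the model's evaluation set), assuming class 0 = F and class 1 = M as in the Quick Start.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: six M names (all correct) and two F names (one misclassified).
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0]
print(accuracy(y_true, y_pred))            # 0.875
print(round(macro_f1(y_true, y_pred), 4))  # 0.7949
```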
The model was trained for 3 epochs with class weighting (F : M = 3:1) to offset the class imbalance in the training data. Training loss fell from 0.41 to 0.34, with stable convergence and no sign of overfitting.
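
One common way to apply this kind of weighting is a class-weighted cross-entropy, where each example's loss is scaled by the weight of its true class. The sketch below is an assumption about how the 3:1 ratio could be used, not the actual training code: it reads the ratio as loss weights of 3 for F (class 0) and 1 for M (class 1), and normalises by the total weight, which is also how PyTorch's `CrossEntropyLoss` behaves with `weight` and mean reduction. The probabilities are made up.

```python
import math

# Assumed loss weights: F (class 0) counts three times as much as M (class 1).
CLASS_WEIGHTS = {0: 3.0, 1: 1.0}


def weighted_cross_entropy(probs, labels, weights=CLASS_WEIGHTS):
    """Weighted mean of -log p(true class), normalised by the total weight."""
    total_loss = 0.0
    total_weight = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total_loss += -w * math.log(p[y])
        total_weight += w
    return total_loss / total_weight


# Toy batch: predicted [P(F), P(M)] per name, and the true class.
probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
labels = [0, 0, 1]

weighted = weighted_cross_entropy(probs, labels)
unweighted = weighted_cross_entropy(probs, labels, weights={0: 1.0, 1: 1.0})
# The misclassified F example (p = 0.4) dominates more once F is up-weighted.
print(round(weighted, 4), round(unweighted, 4))  # 0.4697 0.4149
```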
---
## What Makes This Model Different
- **Handles global names**: from Wei (Chinese) to Haruto (Japanese) to Ama (Ghanaian)
- **Generalizes beyond dictionaries**: learns naming patterns rather than relying on lookup tables
- **Custom lightweight architecture**: small enough to run comfortably on CPU
- **Fully compatible** with Hugging Face Transformers: loads like any standard model
---
## Training Details
| Detail | Value |
|---|---|
| Base model | `SpiceeChat/Genre-Classifier-1-20M-BASE-BF16` |
| Training data | 150,000 dataset samples + 922 hand-curated examples |
| Optimizer | AdamW (LR = 2e-5) |
| Batch size | 64 (train) / 256 (eval) |
| Hardware | Tesla T4 (FP16) |
---
## Notes
- The model uses weight tying between `head.weight` and `tok_emb.weight`. A harmless `head.weight | MISSING` warning may appear on load; this is expected behavior.
- `trust_remote_code=True` is required because the architecture is custom. The modeling code is included in this repository and fully auditable.
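
For readers unfamiliar with weight tying, here is a minimal PyTorch sketch. `TinyTiedLM` and its sizes are hypothetical stand-ins, not this repository's modeling code; only the attribute names `tok_emb` and `head` come from the note above. Because both names point at one tensor, a checkpoint only needs to store a single copy, which is one plausible reason a loader can report `head.weight` as missing before the tie is re-applied.

```python
import torch
import torch.nn as nn


class TinyTiedLM(nn.Module):
    """Hypothetical toy model that ties the output head to the token embedding."""

    def __init__(self, vocab_size: int = 100, dim: int = 16):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, dim)        # weight: (vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size, bias=False)  # weight: (vocab_size, dim)
        # Tie the weights: the output projection reuses the embedding matrix.
        self.head.weight = self.tok_emb.weight

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.tok_emb(input_ids))


model = TinyTiedLM()
logits = model(torch.tensor([[1, 2, 3]]))

# Both attributes refer to the same Parameter, so it is counted once.
print(model.head.weight is model.tok_emb.weight)   # True
print(sum(p.numel() for p in model.parameters()))  # 1600
print(logits.shape)                                 # torch.Size([1, 3, 100])
```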
---
## Try It Yourself
```bash
python -c "
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
name = input('Enter a first name: ')
inputs = tokenizer(name, return_tensors='pt')
pred, _ = model.predict_gender(inputs.input_ids)
print('M' if pred.item() == 1 else 'F')
"
```
---
## License
Released under the Apache 2.0 license. Use it, modify it, ship it, subject to the license terms (keep the license and attribution notices).
---
<p align="center">
<sub>Built with a lot of caffeine ☕ by SpiceeChat</sub>
</p>
> Built by **PhysiQuanty** (who did most of the work) and **QuantaSparkLabs**.