---
license: agpl-3.0
tags:
- pytorch
- names
- gender-classification
- text-classification
library_name: pytorch
---
# nameprediction
This repository hosts the published model weights for `nameprediction`, a byte-level transformer that predicts:
- gender label
- female probability (`f_prob`)
- country label
- country confidence
- region label
- region confidence
The weights are meant to be loaded through the `nameprediction` Python package, which is available [on PyPI](https://pypi.org/project/nameprediction/).
## Usage
Install the Python package:
```bash
pip install nameprediction
```
Load the default published model from this Hugging Face repository:
```python
from nameprediction import NamePredictor

predictor = NamePredictor.from_pretrained(
    repo_id="romor/nameprediction",
    filename="name_gender_country_model_v14.pth",
)
print(predictor.predict_name("Ada Lovelace").to_dict())
```
Calling `from_pretrained()` without arguments uses the package's built-in defaults:
```python
from nameprediction import NamePredictor
predictor = NamePredictor.from_pretrained()
print(predictor.predict_name("Ada Lovelace").to_dict())
```
## Model Summary
- Architecture: byte-level transformer over UTF-8 encoded names
- Tasks: multi-task prediction of gender, country, and region
- Checkpoint file: `name_gender_country_model_v14.pth`
- Package default model source: `romor/nameprediction`
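
To illustrate what "byte-level over UTF-8 encoded names" means in practice, the sketch below turns a name into a fixed-length list of byte ids. It is not taken from the package: the function name `encode_name`, the maximum length, and the padding id `256` are assumptions chosen for illustration, and the real preprocessing may differ.

```python
def encode_name(name: str, max_len: int = 32, pad_id: int = 256) -> list:
    """Map a name to byte ids in 0-255, padded to max_len with pad_id.

    Hypothetical helper: max_len and pad_id are illustrative assumptions,
    not values read from the published checkpoint.
    """
    # UTF-8 encoding yields one id per byte, so non-ASCII characters
    # occupy two or more positions.
    byte_ids = list(name.encode("utf-8"))[:max_len]
    # Right-pad with an id outside the byte range so padding is unambiguous.
    return byte_ids + [pad_id] * (max_len - len(byte_ids))


print(encode_name("Zoë", max_len=6))  # [90, 111, 195, 171, 256, 256]
```

Because the vocabulary is just the 256 possible byte values plus any special ids, names in any script can be encoded without a language-specific tokenizer.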
## Intended Use
This model is intended for large-scale name-based inference workflows where approximate predictions are acceptable, for example exploratory analysis or enrichment pipelines.
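
For large inputs, names often repeat, so one possible pattern is to predict each distinct name only once. The sketch below is an assumption about how a pipeline might be organized, not part of the package: `predict_unique` is a hypothetical helper, and `fake_predict` is a stub standing in for the real predictor so the example runs without downloading weights.

```python
from typing import Callable, Dict, Iterable


def predict_unique(
    names: Iterable[str],
    predict_fn: Callable[[str], dict],
) -> Dict[str, dict]:
    """Call predict_fn once per distinct, whitespace-normalized name."""
    results: Dict[str, dict] = {}
    for raw in names:
        # Collapse runs of whitespace and trim the ends.
        key = " ".join(raw.split())
        # Skip empty strings and names that were already predicted.
        if key and key not in results:
            results[key] = predict_fn(key)
    return results


def fake_predict(name: str) -> dict:
    # Stub for illustration only; it does not mimic the real output fields.
    return {"name_length": len(name)}


out = predict_unique(
    ["Ada Lovelace", " Ada  Lovelace ", "Alan Turing", ""],
    fake_predict,
)
print(list(out))  # ['Ada Lovelace', 'Alan Turing']
```

With the real package, `predict_fn` could be `lambda n: predictor.predict_name(n).to_dict()`, using the `predictor` object from the Usage section.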
## Limitations
- Predictions are probabilistic and can be wrong, especially for rare, multicultural, transliterated, or ambiguous names.
- Name-based gender inference is inherently imperfect and may not reflect a person's self-identified gender.
- Country and region predictions are weak proxies inferred from names only, not verified demographic facts.
- The model should not be used as the sole basis for decisions that affect people.
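
One way to act on these limitations is to keep a gender label only when `f_prob` is far from 0.5 and to mark everything else as uncertain. The sketch below assumes `f_prob` is a probability between 0 and 1. The helper name, the returned strings, and the `margin` value are hypothetical and should be tuned or replaced for a given application.

```python
def label_with_uncertainty(f_prob: float, margin: float = 0.3) -> str:
    """Return a coarse label, or 'uncertain' when f_prob is near 0.5.

    Hypothetical helper: margin=0.3 is an illustrative choice, not a
    recommendation from the model authors.
    """
    if not 0.0 <= f_prob <= 1.0:
        raise ValueError("f_prob must be between 0 and 1")
    if f_prob >= 0.5 + margin:
        return "likely_female"
    if f_prob <= 0.5 - margin:
        return "likely_male"
    # Ambiguous names fall through instead of being forced into a class.
    return "uncertain"


print(label_with_uncertainty(0.95))  # likely_female
print(label_with_uncertainty(0.55))  # uncertain
```

Reporting the share of names that fall into the uncertain bucket alongside any aggregate result makes the approximation visible to downstream readers.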
## Safety and Bias
This model may reflect biases in the training data and in the simplifying assumption that names carry reliable signals about gender, country, or region. Use caution when applying it to sensitive populations or downstream decision-making systems.
## Package Repository
- PyPI: [`nameprediction`](https://pypi.org/project/nameprediction/)
- GitHub: <https://github.com/tomthe/nameprediction>