---
license: agpl-3.0
tags:
- pytorch
- names
- gender-classification
- text-classification
library_name: pytorch
---

# nameprediction

This repository hosts the published model weights for `nameprediction`, a byte-level transformer that predicts:

- gender label
- female probability (`f_prob`)
- country label
- country confidence
- region label
- region confidence

The intended runtime package is available [on PyPI](https://pypi.org/project/nameprediction/) as `nameprediction`.

## Usage

Install the Python package:

```bash
pip install nameprediction
```

Load the default published model from this Hugging Face repository:

```python
from nameprediction import NamePredictor

predictor = NamePredictor.from_pretrained(
    repo_id="romor/nameprediction",
    filename="name_gender_country_model_v14.pth",
)
print(predictor.predict_name("Ada Lovelace").to_dict())
```

The package also works with the built-in defaults:

```python
from nameprediction import NamePredictor

predictor = NamePredictor.from_pretrained()
print(predictor.predict_name("Ada Lovelace").to_dict())
```

## Model Summary

- Architecture: byte-level transformer over UTF-8 encoded names
- Tasks: multi-task prediction of gender, country, and region
- Checkpoint file: `name_gender_country_model_v14.pth`
- Package default model source: `romor/nameprediction`

## Intended Use

This model is intended for large-scale name-based inference workflows where approximate predictions are acceptable, for example exploratory analysis or enrichment pipelines.

## Limitations

- Predictions are probabilistic and can be wrong, especially for rare, multicultural, transliterated, or ambiguous names.
- Name-based gender inference is inherently imperfect and may not reflect a person's self-identified gender.
- Country and region predictions are weak proxies inferred from names only, not verified demographic facts.
- The model should not be used as the sole basis for decisions that affect people.

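
Because predictions are probabilistic, one way to act on the limitations listed above is to abstain when the female probability is close to 0.5. The following is a minimal sketch, not part of the package: it assumes the dictionary returned by `to_dict()` exposes the female probability under the key `f_prob`, and the function name `gender_or_abstain`, the `margin` default, and the returned label strings are all hypothetical choices made for illustration.

```python
def gender_or_abstain(prediction: dict, margin: float = 0.2) -> str:
    """Return a gender label only when the prediction is far enough from 0.5.

    Hypothetical helper: the "f_prob" key is assumed to match the output of
    to_dict(); the labels "female", "male", and "unknown" are illustrative.
    """
    f_prob = float(prediction["f_prob"])
    if not 0.0 <= f_prob <= 1.0:
        raise ValueError(f"f_prob must lie in [0, 1], got {f_prob}")

    # Treat probabilities within `margin` of 0.5 as too ambiguous to label.
    if abs(f_prob - 0.5) < margin:
        return "unknown"
    return "female" if f_prob > 0.5 else "male"


if __name__ == "__main__":
    # Plain dictionaries stand in for real model output here.
    for example in ({"f_prob": 0.95}, {"f_prob": 0.55}, {"f_prob": 0.10}):
        print(example, "->", gender_or_abstain(example))
```

A wider `margin` trades coverage for fewer wrong labels; a suitable value depends on the dataset and should be checked against names with known labels.
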
## Safety And Bias

This model may reflect biases in the training data and in the simplifying assumption that names carry reliable signals about gender, country, or region. Use caution when applying it to sensitive populations or downstream decision-making systems.

## Package Repository

- PyPI: `nameprediction`
- GitHub: `https://github.com/tomthe/nameprediction`