genderBR: Gender Prediction from Brazilian First Names

A character-level neural network that predicts the probability of a Brazilian first name being female.

Model

2-layer bidirectional GRU with attention pooling.

Parameter       Value
Embedding dim   64
Hidden dim      192
GRU layers      2 (bidirectional)
Pooling         Learned attention
Dropout         0.1 (embedding), 0.2 (inter-layer), 0.4 (output)
Parameters      ~600K
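
To make the "Learned attention" pooling row concrete, here is a minimal base-R sketch of one common formulation: each character position gets a scalar score, the scores are softmax-normalised, and the hidden states are averaged with those weights. The function name `attention_pool`, the scoring vector `w`, and the toy dimensions are illustrative assumptions; the exact scoring function used by this model is not stated here, and padding masks are not handled.

```r
# Attention pooling over a sequence of hidden states (base R, no torch needed).
# H: T x D matrix, one row per character position (assumed layout).
# w: length-D scoring vector; in a real model this would be a learned parameter.
attention_pool <- function(H, w) {
  scores <- as.vector(H %*% w)               # one scalar score per position
  scores <- scores - max(scores)             # stabilise the softmax
  alpha  <- exp(scores) / sum(exp(scores))   # attention weights, sum to 1
  list(weights = alpha, pooled = as.vector(t(H) %*% alpha))
}

set.seed(1)
H <- matrix(rnorm(5 * 4), nrow = 5, ncol = 4)  # toy input: 5 positions, 4 features
w <- rnorm(4)                                  # stand-in for learned weights
out <- attention_pool(H, w)
```

With an all-zero `w` every position receives the same weight, so the result reduces to plain mean pooling; training moves the weights toward the positions that matter most for the prediction.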

Training

  • Data: 142K unique names from IBGE Census (2010 & 2022)
  • Target: Probability of a name being female (continuous, 0–1)
  • Loss: BCE with logits
  • Optimizer: Adam (lr=1e-3, weight_decay=1e-4)
  • Split: 80/10/10 train/validation/test
  • Early stopping: patience=5 on validation loss
  • Framework: R torch + luz
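
Because the target is a continuous probability rather than a 0/1 label, the loss is binary cross-entropy against soft targets. The sketch below computes it from raw logits in base R using the standard numerically stable form; `bce_with_logits` and the toy values are illustrative assumptions, not the package's training code.

```r
# Binary cross-entropy from logits with soft (continuous) targets.
# Stable form: max(z, 0) - z * y + log(1 + exp(-|z|)), averaged over examples.
bce_with_logits <- function(logit, target) {
  mean(pmax(logit, 0) - logit * target + log1p(exp(-abs(logit))))
}

logits  <- c(3.0, -4.0, 0.0)     # toy model outputs before the sigmoid
targets <- c(0.95, 0.02, 0.50)   # toy female-probability targets in [0, 1]
loss <- bce_with_logits(logits, targets)
```

With soft targets the loss cannot reach zero: its minimum is the entropy of the targets themselves, which is worth keeping in mind when reading a reported BCE value.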

Performance (held-out test set)

Metric                     Value
BCE loss                   0.110
Accuracy (threshold 0.5)   96.5%
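
As a rough illustration of how an accuracy at threshold 0.5 can be computed when the targets are themselves probabilities, the sketch below cuts both the predictions and the targets at the same threshold and compares the resulting classes. Binarising the targets this way is an assumption about the evaluation, and `accuracy_at` and the toy values are hypothetical.

```r
# Accuracy after cutting predicted and target probabilities at one threshold.
accuracy_at <- function(prob, target, threshold = 0.5) {
  mean((prob >= threshold) == (target >= threshold))
}

prob   <- c(0.95, 0.02, 0.60, 0.40)  # toy predicted probabilities
target <- c(0.99, 0.01, 0.30, 0.45)  # toy target proportions
acc <- accuracy_at(prob, target)     # third pair disagrees, so 3 of 4 match
```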

Usage

# install.packages("genderBR")
library(genderBR)

download_gender_model()  # one-time download

get_gender_nn("Maria")
#> "Female"

get_gender_nn(c("Lusjane", "Joao"), prob = TRUE)
#> 0.95  0.02
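
When working with probabilities such as those returned with prob = TRUE, it can be useful to label only confident predictions and leave ambiguous names unclassified. The following base-R sketch does this with a symmetric band; the helper `label_with_band`, the cut-offs 0.1 and 0.9, the "Male" label, and the third value are illustrative assumptions rather than part of the package.

```r
# Label a name only when the female probability is far from 0.5.
label_with_band <- function(prob_female, lower = 0.1, upper = 0.9) {
  ifelse(prob_female >= upper, "Female",
         ifelse(prob_female <= lower, "Male", NA_character_))
}

# First two values mirror the example output above; the third is made up
# to show an ambiguous case.
probs  <- c(Lusjane = 0.95, Joao = 0.02, ambiguous = 0.55)
labels <- label_with_band(probs)
```

Names that fall inside the band come back as NA, so they can be reviewed manually or dropped instead of being forced into a class.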

Files

  • genderbr_weights.pt: model state dict (R torch format)
  • genderbr_vocab.rds: vocabulary and hyperparameters
