---
license: apache-2.0
language:
- ko
- en
library_name: transformers
pipeline_tag: text-classification
tags:
- name-gender
- korean
- multilingual
- xlm-roberta
base_model: FacebookAI/xlm-roberta-base
---
# Name Gender Classifier (Korean/English)
An XLM-RoBERTa-based gender classifier for Korean and English names.
## Model Description
- **Base Model**: FacebookAI/xlm-roberta-base
- **Task**: Text Classification (name → gender)
- **Languages**: Korean (ko), English (en)
- **Labels**: `male`, `female`
## Usage
```python
from transformers import pipeline
classifier = pipeline("text-classification", model="solonsophy/name-gender-classifier-ko")
# Korean names
classifier("민준")  # → male
classifier("서연")  # → female
classifier("김민준")  # → male
# English names
classifier("James")  # → male
classifier("Emma")  # → female
# Cross-cultural names
classifier("다니엘")  # → male
classifier("소피아")  # → female
```
## Direct Model Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
tokenizer = AutoTokenizer.from_pretrained("solonsophy/name-gender-classifier-ko")
model = AutoModelForSequenceClassification.from_pretrained("solonsophy/name-gender-classifier-ko")
def predict(name):
    inputs = tokenizer(name, return_tensors="pt", padding=True, truncation=True, max_length=32)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    pred_id = torch.argmax(probs, dim=1).item()
    return model.config.id2label[pred_id], probs[0][pred_id].item()
print(predict("서준"))  # ('male', 0.996)
```
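For batched inference, the logits-to-label step in `predict` can be factored into a small helper that works on a whole batch at once. A minimal sketch, using an assumed `{0: "female", 1: "male"}` mapping for illustration (verify against `model.config.id2label` before relying on it):

```python
import torch

def decode_logits(logits, id2label):
    """Map a batch of classification logits to (label, probability) pairs."""
    probs = torch.softmax(logits, dim=1)   # normalize each row to probabilities
    pred_ids = torch.argmax(probs, dim=1)  # most likely class per name
    return [
        (id2label[i.item()], probs[row, i].item())
        for row, i in enumerate(pred_ids)
    ]

# Dummy logits for a batch of two names (2 classes); in practice pass
# model(**inputs).logits for a padded batch of tokenized names.
id2label = {0: "female", 1: "male"}  # assumed mapping; check model.config.id2label
logits = torch.tensor([[2.0, -1.0], [-0.5, 1.5]])
print(decode_logits(logits, id2label))
```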
## Limitations
- Optimized for Korean and English names
- May have lower accuracy for names from other language origins
- Some unisex names may show lower confidence scores
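One way to act on the unisex-name caveat in application code is to treat low-confidence predictions as undecided rather than forcing a binary answer. A minimal sketch (the 0.75 cutoff is an arbitrary illustration, not a calibrated value for this model):

```python
def gender_or_uncertain(label, prob, threshold=0.75):
    """Return the predicted label only when the probability clears the cutoff."""
    return label if prob >= threshold else "uncertain"

# With the (label, probability) pair returned by predict() above:
print(gender_or_uncertain("male", 0.996))   # → male
print(gender_or_uncertain("female", 0.58))  # → uncertain
```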
## Acknowledgments
This model was trained with computing resources provided by [DDOK.AI](https://huggingface.co/DDOKAI).
## License
Apache-2.0