	## Description

	Afroscope-model is a language identification (LID) model from the AfroScope project. It is fine-tuned from [Serengeti](https://huggingface.co/UBC-NLP/serengeti) and supports 713 African languages.

	For details on the supported languages, performance, and significant changes from previous versions, see LINK_HERE.

	- Dataset: [14kwonss/afroscope-data](https://huggingface.co/datasets/14kwonss/afroscope-data)
	- Repository: [GitHub](https://github.com/skwon01-UBC/AfroScope?tab=readme-ov-file)
	- Paper: [arXiv:2601.13346](https://www.arxiv.org/pdf/2601.13346)

	---

	## How to use

	Here is how to use this model to detect the language of a given text:

	```python
	from transformers import pipeline

	# Load the LID model as a text-classification pipeline
	afroscope_model = pipeline("text-classification", model="UBC-NLP/afroscope-model")

	input_text = "Ninyepuní íne εtɩε, bε ewǐe Jesi ɔnʋ lεfε kʋkʋkpɔ cε."

	result = afroscope_model(input_text)

	# Extract the label and score from the first result
	language = result[0]['label']
	score = result[0]['score']

	print(f"detected language: {language}\tscore: {round(score*100, 2)}")
	```
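
The snippet above reports the top prediction even when its score is low. The sketch below is one possible way to request several candidates and discard the low-scoring ones. The helper names `confident_languages` and `detect_top_k` and the 0.5 cutoff are illustrative assumptions, not part of the AfroScope release. `top_k` is a standard argument of the Transformers text-classification pipeline, though the nesting of its output may differ between library versions.

```python
from typing import Dict, List


def confident_languages(predictions: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep predictions scoring at least min_score, highest score first.

    `predictions` is a list of {"label": ..., "score": ...} dicts, the shape
    returned by a text-classification pipeline. The 0.5 default is an
    arbitrary, hypothetical cutoff; tune it on your own data.
    """
    kept = [p for p in predictions if p["score"] >= min_score]
    return sorted(kept, key=lambda p: p["score"], reverse=True)


def detect_top_k(text: str, k: int = 3, min_score: float = 0.5) -> List[Dict]:
    """Run the model on `text` and return up to k confident candidates.

    Downloads the model on first use. Transformers is imported lazily so
    that confident_languages() can be used without it installed.
    """
    from transformers import pipeline

    classifier = pipeline("text-classification", model="UBC-NLP/afroscope-model")
    # Assumes a single string with top_k returns a flat list of dicts.
    return confident_languages(classifier(text, top_k=k), min_score)


# Offline illustration: labels and scores here are made-up placeholders,
# not real model output.
sample = [
    {"label": "aaa", "score": 0.2},
    {"label": "bbb", "score": 0.9},
    {"label": "ccc", "score": 0.6},
]
for candidate in confident_languages(sample):
    print(f"{candidate['label']}\t{round(candidate['score'] * 100, 2)}")
```

An empty list from `confident_languages` means that no candidate cleared the cutoff, which callers can treat as "language undetermined".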

	## Citation

	```bibtex
	@article{kwon2026afroscope,
	title={AfroScope: A Framework for Studying the Linguistic Landscape of Africa},
	author={Kwon, Sang Yun and Elmadany, AbdelRahim and Abdul-Mageed, Muhammad},
	journal={arXiv preprint arXiv:2601.13346},
	year={2026}
	}
	```