## Description

Afroscope-model is a language identification (LID) model from the AfroScope project. It is fine-tuned from [Serengeti](https://huggingface.co/UBC-NLP/serengeti) and supports 713 African languages. For details on the supported languages and performance, as well as significant changes from previous versions, please refer to LINK_HERE.

- **Dataset:** [dataset](https://huggingface.co/datasets/14kwonss/afroscope-data)
- **Repository:** [GitHub](https://github.com/skwon01-UBC/AfroScope?tab=readme-ov-file)
- **Paper:** [arXiv](https://www.arxiv.org/pdf/2601.13346)

---

## How to use

Here is how to use this model to detect the language of a given text:

```python
from transformers import pipeline

# Load the model as a text-classification pipeline
afroscope_model = pipeline("text-classification", model="UBC-NLP/afroscope-model")

input_text = "Ninyepuní íne εtɩε, bε ewǐe Jesi ɔnʋ lεfε kʋkʋkpɔ cε."

result = afroscope_model(input_text)

# Extract the label and score from the first result
language = result[0]["label"]
score = result[0]["score"]
print(f"detected language: {language}\tscore: {round(score * 100, 2)}")
```

## Citation

```bibtex
@article{kwon2026afroscope,
  title={AfroScope: A Framework for Studying the Linguistic Landscape of Africa},
  author={Kwon, Sang Yun and Elmadany, AbdelRahim and Abdul-Mageed, Muhammad},
  journal={arXiv preprint arXiv:2601.13346},
  year={2026}
}
```
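
## Filtering low-confidence predictions

The usage example above prints whatever label scores highest. For short or noisy inputs, it may be useful to discard predictions below a confidence threshold. The following is a minimal sketch, not part of the AfroScope release: the helper names `pick_language` and `detect_languages`, the `min_score` default of 0.5, and the choice of `top_k=3` are all assumptions made for illustration. It relies only on the `label` and `score` fields already shown in the example above.

```python
from typing import Dict, List, Optional, Tuple

# A pipeline-style prediction: {"label": str, "score": float}
Prediction = Dict[str, object]


def pick_language(
    predictions: List[Prediction], min_score: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return (label, score) for the best prediction, or None if it is
    below min_score. Hypothetical helper; the threshold is an assumption."""
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] < min_score:
        return None
    return best["label"], best["score"]


def detect_languages(
    texts: List[str], min_score: float = 0.5, top_k: int = 3
) -> List[Optional[Tuple[str, float]]]:
    """Run the model on a batch of texts and filter each result.
    Hypothetical wrapper; it downloads the model on first call."""
    # Imported lazily so pick_language can be used without transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model="UBC-NLP/afroscope-model")
    # With a list input and an explicit top_k, the pipeline returns one
    # list of {"label", "score"} dicts per input text.
    outputs = classifier(texts, top_k=top_k)
    return [pick_language(candidates, min_score) for candidates in outputs]
```

Calling `detect_languages([input_text])` would return a one-element list holding either a `(label, score)` tuple or `None` when the top score falls under the threshold. A suitable threshold depends on the input length and domain, so it is best tuned on held-out data rather than taken from this sketch.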