metadata

title: AfriBERT Kenya MLM Compare
emoji: 🤖
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.50.0
python_version: '3.10'
app_file: app.py
pinned: false

AfriBERT Kenya Masked LM Gradio App

Gradio demo for comparing masked-language-modeling predictions from:

Base model: castorini/afriberta_large
Adapted model: Rogendo/afribert-kenya-adapted

The app uses the same tokenizer, castorini/afriberta_large, for both models so the MLM predictions are directly comparable.
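A minimal sketch of the shared-tokenizer setup, assuming the app builds one `fill-mask` pipeline per model with the `transformers` library. The function name `build_fill_mask_pipelines` and the `pipeline_factory` parameter are hypothetical and are not taken from app.py; the factory argument only exists so the wiring can be exercised without downloading weights.

```python
BASE_MODEL_ID = "castorini/afriberta_large"
ADAPTED_MODEL_ID = "Rogendo/afribert-kenya-adapted"
TOKENIZER_ID = "castorini/afriberta_large"


def build_fill_mask_pipelines(pipeline_factory=None, token=None):
    """Return {"base": ..., "adapted": ...} fill-mask pipelines.

    Both pipelines receive the same tokenizer ID, so a given input is split
    into identical tokens and the two models rank the same vocabulary.
    Hypothetical helper: the real app.py may be organised differently.
    """
    if pipeline_factory is None:
        # Imported lazily so the module loads even before dependencies are installed.
        from transformers import pipeline as pipeline_factory

    models = (("base", BASE_MODEL_ID), ("adapted", ADAPTED_MODEL_ID))
    return {
        name: pipeline_factory(
            "fill-mask",
            model=model_id,
            tokenizer=TOKENIZER_ID,  # shared on purpose
            token=token,             # None is fine for public repositories
        )
        for name, model_id in models
    }
```

Calling `build_fill_mask_pipelines()` with no arguments downloads both models on first use, so expect a delay and a few gigabytes of disk usage.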

The app supports Swahili, Sheng, Kenyan institutional text, M-PESA language, and English-Swahili code-switching examples.

Run locally

PyTorch may not yet provide installable builds for Python 3.14. Use Python 3.10 for this app, which matches the python_version set in the Space metadata.

cd afribert-kenya-mlm-gradio  # your local copy of this project; adjust the path as needed
python3.10 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
export HF_TOKEN="your_huggingface_read_token"
python app.py

If python3.10 is not installed on macOS:

brew install python@3.10

HF_TOKEN is optional when the model repositories are public. If a repository is private, the token must have read access to it.

Optional overrides:

export MODEL_ID="Rogendo/afribert-kenya-adapted"
export ADAPTED_MODEL_ID="Rogendo/afribert-kenya-adapted"
export BASE_MODEL_ID="castorini/afriberta_large"
export TOKENIZER_ID="castorini/afriberta_large"
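The overrides above could be read on the Python side roughly as follows. This is a sketch under assumptions, not the code in app.py: the helper `resolve_settings` is hypothetical, and treating `MODEL_ID` as a fallback for `ADAPTED_MODEL_ID` is a guess based on both variables carrying the same value in the example.

```python
import os

# Defaults taken from the model and tokenizer IDs named in this README.
DEFAULTS = {
    "BASE_MODEL_ID": "castorini/afriberta_large",
    "ADAPTED_MODEL_ID": "Rogendo/afribert-kenya-adapted",
    "TOKENIZER_ID": "castorini/afriberta_large",
}


def resolve_settings(env=None):
    """Merge environment overrides with the defaults (hypothetical helper)."""
    env = os.environ if env is None else env

    # An unset or empty variable falls back to the default.
    settings = {key: env.get(key) or default for key, default in DEFAULTS.items()}

    # Assumption: MODEL_ID is an alias that applies only when
    # ADAPTED_MODEL_ID itself is not set.
    if not env.get("ADAPTED_MODEL_ID") and env.get("MODEL_ID"):
        settings["ADAPTED_MODEL_ID"] = env["MODEL_ID"]

    # HF_TOKEN stays None when absent, which is enough for public repositories.
    settings["HF_TOKEN"] = env.get("HF_TOKEN") or None
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the precedence easy to check without touching the shell environment.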

Hugging Face Space

Create a Gradio Space and upload:

app.py
requirements.txt
README.md
runtime.txt

Then add a Space secret named HF_TOKEN with a Hugging Face token that can read the model.

Usage

Mark the word to predict with the tokenizer's mask token, <mask>, as shown in the app. [MASK] is also accepted and is converted to <mask> automatically.
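The conversion from `[MASK]` to `<mask>` can be sketched as a plain string replacement. The helper names `normalize_mask` and `count_masks` are hypothetical, and the app may handle additional cases (for example different casing) that this sketch does not.

```python
MASK_TOKEN = "<mask>"   # mask token named in this README
BERT_STYLE_MASK = "[MASK]"


def normalize_mask(text, mask_token=MASK_TOKEN):
    """Rewrite every BERT-style [MASK] into the tokenizer's own mask token."""
    return text.replace(BERT_STYLE_MASK, mask_token)


def count_masks(text, mask_token=MASK_TOKEN):
    """Count mask positions after normalization, e.g. to reject inputs with none."""
    return normalize_mask(text, mask_token).count(mask_token)
```

Normalizing before the text reaches the model matters because the tokenizer would otherwise treat `[MASK]` as ordinary characters, leaving nothing to predict.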

Examples:

Tulifanya meeting jana na manager akasema <mask> itakuwa ready wiki ijayo.

Msee alikuwa poa sana, akanisaidia kupata <mask> ya ofisi.

The first output table compares the base and adapted model rank-by-rank. The second table shows each model's completed sentence for every prediction.
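The two tables can be assembled from the prediction lists roughly as shown here. This sketch assumes each prediction is a dictionary with `token_str` and `score` keys, which is the shape returned by the `transformers` fill-mask pipeline; the function names and column layout are hypothetical and may differ from app.py.

```python
def compare_rank_by_rank(base_preds, adapted_preds):
    """Build rows of [rank, base token, base score, adapted token, adapted score].

    Predictions are expected in descending score order, so the position
    in each list is the rank.
    """
    rows = []
    for rank, (base, adapted) in enumerate(zip(base_preds, adapted_preds), start=1):
        rows.append([
            rank,
            base["token_str"].strip(),
            round(base["score"], 4),
            adapted["token_str"].strip(),
            round(adapted["score"], 4),
        ])
    return rows


def completed_sentences(text, preds, mask_token="<mask>"):
    """Fill the first mask in `text` with each predicted token in turn."""
    return [text.replace(mask_token, pred["token_str"].strip(), 1) for pred in preds]
```

Stripping `token_str` removes the leading space that SentencePiece-style tokenizers often attach to word-initial tokens, which keeps the completed sentences readable.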