| title: AfriBERT Kenya MLM Compare | |
| emoji: 🤖 | |
| colorFrom: green | |
| colorTo: blue | |
| sdk: gradio | |
| sdk_version: "5.50.0" | |
| python_version: "3.10" | |
| app_file: app.py | |
| pinned: false | |
| # AfriBERT Kenya Masked LM Gradio App | |
| Gradio demo for comparing masked-language-modeling predictions from: | |
| - Base model: `castorini/afriberta_large` | |
| - Adapted model: `Rogendo/afribert-kenya-adapted` | |
| The app uses the same tokenizer, `castorini/afriberta_large`, for both models so the MLM predictions are directly comparable. | |
| The app supports Swahili, Sheng, Kenyan institutional text, M-PESA language, and English-Swahili code-switching examples. | |
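The shared-tokenizer setup described above could be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: it assumes the Hugging Face `transformers` `pipeline("fill-mask", ...)` API, and the helper name `load_fill_mask_pipelines` and its `pipeline_factory` parameter are hypothetical names introduced here so the function can be exercised without downloading the models.

```python
import os

BASE_MODEL_ID = "castorini/afriberta_large"
ADAPTED_MODEL_ID = "Rogendo/afribert-kenya-adapted"
TOKENIZER_ID = "castorini/afriberta_large"  # shared by both models


def load_fill_mask_pipelines(pipeline_factory=None, top_k=5):
    """Build one fill-mask pipeline per model, both using TOKENIZER_ID.

    `pipeline_factory` is a hypothetical injection point for this sketch;
    when omitted, `transformers.pipeline` is used.
    """
    if pipeline_factory is None:
        # Imported lazily so the heavy dependency is only needed at call time.
        from transformers import pipeline as pipeline_factory

    token = os.environ.get("HF_TOKEN")  # may be None for public models
    models = (("base", BASE_MODEL_ID), ("adapted", ADAPTED_MODEL_ID))
    return {
        name: pipeline_factory(
            "fill-mask",
            model=model_id,
            tokenizer=TOKENIZER_ID,
            top_k=top_k,
            token=token,
        )
        for name, model_id in models
    }


# Example use (downloads both models on first run):
#   pipes = load_fill_mask_pipelines()
#   for name, pipe in pipes.items():
#       print(name, [p["token_str"] for p in pipe("Habari ya <mask>.")])
```

Because both pipelines tokenize the input identically, the mask falls at the same position for each model, which is what makes their ranked predictions comparable.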
| ## Run locally | |
| At the time of writing, PyTorch did not install on Python 3.14. Use Python 3.10 for this app, which matches the `python_version` pinned in the Space metadata. | |
| ```bash | |
| cd /path/to/afribert-kenya-mlm-gradio  # replace with the location of your local copy | |
| python3.10 -m venv venv | |
| source venv/bin/activate | |
| python -m pip install --upgrade pip | |
| pip install -r requirements.txt | |
| export HF_TOKEN="your_huggingface_read_token" | |
| python app.py | |
| ``` | |
| If `python3.10` is not installed on macOS: | |
| ```bash | |
| brew install python@3.10 | |
| ``` | |
| If the model is public, `HF_TOKEN` is optional. If it is private, the token must have read access. | |
| Optional overrides: | |
| ```bash | |
| export MODEL_ID="Rogendo/afribert-kenya-adapted" | |
| export ADAPTED_MODEL_ID="Rogendo/afribert-kenya-adapted" | |
| export BASE_MODEL_ID="castorini/afriberta_large" | |
| export TOKENIZER_ID="castorini/afriberta_large" | |
| ``` | |
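One way the app might resolve these overrides is sketched below. The default values are the model IDs listed above. The helper name `resolve_model_ids` is hypothetical, and so is the precedence rule: this README does not say how `MODEL_ID` and `ADAPTED_MODEL_ID` interact, so the sketch assumes `MODEL_ID` is a fallback alias used only when `ADAPTED_MODEL_ID` is unset.

```python
import os

DEFAULTS = {
    "ADAPTED_MODEL_ID": "Rogendo/afribert-kenya-adapted",
    "BASE_MODEL_ID": "castorini/afriberta_large",
    "TOKENIZER_ID": "castorini/afriberta_large",
}


def resolve_model_ids(env=None):
    """Return the effective model and tokenizer IDs from environment variables."""
    env = os.environ if env is None else env
    # Empty or missing variables fall back to the documented defaults.
    resolved = {key: env.get(key) or default for key, default in DEFAULTS.items()}
    # ASSUMPTION: MODEL_ID only applies when ADAPTED_MODEL_ID is not set.
    if not env.get("ADAPTED_MODEL_ID") and env.get("MODEL_ID"):
        resolved["ADAPTED_MODEL_ID"] = env["MODEL_ID"]
    return resolved


if __name__ == "__main__":
    for key, value in resolve_model_ids().items():
        print(f"{key}={value}")
```

Check `app.py` for the actual precedence before relying on `MODEL_ID`.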
| ## Hugging Face Space | |
| Create a Gradio Space and upload: | |
| - `app.py` | |
| - `requirements.txt` | |
| - `README.md` | |
| - `runtime.txt` | |
| Then add a Space secret named `HF_TOKEN` with a Hugging Face token that can read the model. | |
| ## Usage | |
| Mark the word to predict with the tokenizer's mask token, `<mask>`, which is also shown in the app. `[MASK]` is accepted too and is automatically converted to `<mask>`. | |
| Examples: | |
| ```text | |
| Tulifanya meeting jana na manager akasema <mask> itakuwa ready wiki ijayo. | |
| ``` | |
| ```text | |
| Msee alikuwa poa sana, akanisaidia kupata <mask> ya ofisi. | |
| ``` | |
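The `[MASK]` conversion mentioned above can be illustrated with a small helper. This is a sketch under assumptions rather than the app's own code: the function names are hypothetical, and the match is assumed to be case-sensitive.

```python
MASK_TOKEN = "<mask>"  # mask token shown in the app


def normalize_mask(text, mask_token=MASK_TOKEN):
    """Replace every literal [MASK] with the tokenizer's mask token."""
    return text.replace("[MASK]", mask_token)


def count_masks(text, mask_token=MASK_TOKEN):
    """Count mask positions after normalization, e.g. to validate input."""
    return normalize_mask(text, mask_token).count(mask_token)


if __name__ == "__main__":
    sentence = "Msee alikuwa poa sana, akanisaidia kupata [MASK] ya ofisi."
    print(normalize_mask(sentence))
    print(count_masks(sentence))
```

Counting masks up front lets a caller reject input with no mask before either model is run.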
| The first output table compares the base and adapted models rank by rank. The second table shows each model's completed sentence for every prediction. | |
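The two output tables can be pictured with the following sketch. It assumes each model's predictions are already available as a list of token strings ordered best first. The function names and row layout are hypothetical and may differ from what `app.py` produces.

```python
from itertools import zip_longest

MASK_TOKEN = "<mask>"


def rank_by_rank(base_tokens, adapted_tokens):
    """Pair predictions by rank: row i holds each model's i-th best token."""
    paired = zip_longest(base_tokens, adapted_tokens, fillvalue="")
    return [
        {"rank": rank, "base": base, "adapted": adapted}
        for rank, (base, adapted) in enumerate(paired, start=1)
    ]


def completed_sentences(text, tokens, mask_token=MASK_TOKEN):
    """Fill the first mask in `text` with each predicted token in turn."""
    return [text.replace(mask_token, token.strip(), 1) for token in tokens]


if __name__ == "__main__":
    # Placeholder tokens for illustration only, not real model output.
    base, adapted = ["tok_a", "tok_b"], ["tok_c", "tok_d"]
    for row in rank_by_rank(base, adapted):
        print(row)
    for sentence in completed_sentences("kupata <mask> ya ofisi.", adapted):
        print(sentence)
```

Pairing by rank makes it easy to see where the adapted model promotes a token the base model ranks lower or misses entirely.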