# SIB200 Base Model

This model was trained on the SIB200 dataset using random data selection.
## Training Parameters

- **Dataset**: SIB200
- **Mode**: Base
- **Selection Method**: Random
- **Train Size**: 700 examples
- **Epochs**: 20
- **Batch Size**: 8
- **Effective Batch Size**: 32 (batch size × gradient accumulation steps)
- **Learning Rate**: 8e-06
- **Patience**: 8
- **Max Length**: 192
- **Gradient Accumulation Steps**: 4
- **Warmup Ratio**: 0.1
- **Weight Decay**: 0.01
- **Optimizer**: AdamW
- **Scheduler**: cosine_with_warmup
- **Random Seed**: 42
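The derived quantities in the list above follow directly from the raw hyperparameters. A minimal sketch in plain Python (the `config` dict and variable names are illustrative, not taken from the actual training script):

```python
import math

# Hypothetical config mirroring the hyperparameters listed above.
config = {
    "train_size": 700,
    "epochs": 20,
    "batch_size": 8,
    "gradient_accumulation_steps": 4,
    "warmup_ratio": 0.1,
}

# Effective batch size = per-device batch size x gradient accumulation steps.
effective_batch_size = config["batch_size"] * config["gradient_accumulation_steps"]
print(effective_batch_size)  # 32

# Optimizer steps per epoch (last partial batch rounded up), total steps,
# and warmup steps implied by the warmup ratio.
steps_per_epoch = math.ceil(config["train_size"] / effective_batch_size)
total_steps = steps_per_epoch * config["epochs"]
warmup_steps = int(config["warmup_ratio"] * total_steps)
print(steps_per_epoch, total_steps, warmup_steps)  # 22 440 44
```

With 700 training examples and an effective batch of 32, this gives 22 optimizer steps per epoch, 440 total steps, and 44 warmup steps.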
## Performance

- **Overall Accuracy**: 78.79%
- **Overall Loss**: 0.0166
### Language-Specific Performance

- **English (EN)**: 82.83%
- **German (DE)**: 87.88%
- **Arabic (AR)**: 54.55%
- **Spanish (ES)**: 87.88%
- **Hindi (HI)**: 80.81%
- **Swahili (SW)**: 78.79%
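The overall accuracy reported above is consistent with the unweighted (macro) average of the six per-language accuracies, which is the natural aggregate when each language contributes the same number of test examples. A quick check:

```python
# Per-language test accuracies (%) from the list above.
per_language = {
    "EN": 82.83, "DE": 87.88, "AR": 54.55,
    "ES": 87.88, "HI": 80.81, "SW": 78.79,
}

# Macro average: unweighted mean across languages.
macro_avg = sum(per_language.values()) / len(per_language)
print(round(macro_avg, 2))  # 78.79
```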
## Model Information

- **Base Model**: bert-base-multilingual-cased
- **Task**: Topic Classification
- **Languages**: 6 languages (EN, DE, AR, ES, HI, SW)