---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_trainer
model-index:
- name: MyPoliBERT-ver03
  results: []
datasets:
- tnwei/ms-newspapers
---


# MyPoliBERT-ver03

## Model Overview
MyPoliBERT-ver03 is a fine-tuned version of `bert-base-uncased`, trained on several datasets for multi-label classification across 12 political topics: Democracy, Economy, Race, Leadership, Development, Corruption, Instability, Safety, Administration, Education, Religion, and Environment. This version updates the original YagiASAFAS/MyPoliBERT model and specifically improves classification performance on the Leadership topic.

## Intended Uses and Limitations
- **Intended Uses**  
  This model is intended for analyzing political texts and identifying multiple political topics, with a special focus on accurately classifying leadership-related content. It can be applied to various text sources such as news articles and social media posts.

- **Limitations**  
  1. The model is fine-tuned on a limited set of Malaysian news and social media sources (see Dataset), partly supplemented with generated text; its performance may therefore vary on texts from different domains or regions.  
  2. As with most deep learning models, the internal decision process is not inherently interpretable; human review is recommended for critical applications.  
  3. The model may not reflect recent political developments due to the static nature of its training data.

## Dataset
The training and evaluation data consist of 29,226 records, split 80% for training and 20% for validation.  
Data sources include:  
- tnwei/ms-newspapers dataset  
- Malaysian political posts from Reddit  
- Malaysian political posts from Instagram  
- Malaysian political posts from Facebook  

Additionally, to address topic and sentiment biases observed in news articles and in social media posts and comments, a portion of the data was generated synthetically through generative-AI-aided data augmentation.

## Model Architecture
- **Base Model**: `bert-base-uncased`  
- **Task**: Multi-label classification for 12 political topics  
- **Output**: The model outputs classification scores for each topic; in this updated version the Leadership classification has been notably improved.
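
As a rough illustration of how per-topic scores could be turned into topic labels, the sketch below applies a sigmoid and a threshold to one raw score per topic. It rests on assumptions that this card does not confirm: that the classification head emits exactly one logit per topic, that the logits follow the topic order listed above, and that 0.5 is a sensible cut-off. The `decode_topics` helper and the example logits are hypothetical; if the checkpoint emits several classes per topic, the decoding step would look different.

```python
import math

# Topic names in the order listed in this card. That the model's output
# indices follow this order is an ASSUMPTION; check the checkpoint's
# config (id2label) before relying on it.
TOPICS = [
    "Democracy", "Economy", "Race", "Leadership", "Development",
    "Corruption", "Instability", "Safety", "Administration",
    "Education", "Religion", "Environment",
]


def decode_topics(logits, threshold=0.5):
    """Return {topic: probability} for topics whose sigmoid score meets the threshold.

    Hypothetical helper: assumes one raw score (logit) per topic.
    """
    if len(logits) != len(TOPICS):
        raise ValueError(f"expected {len(TOPICS)} scores, got {len(logits)}")
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return {t: round(p, 4) for t, p in zip(TOPICS, probs) if p >= threshold}


# Made-up logits for illustration only; these are not real model outputs.
example_logits = [-2.0, 1.5, -3.0, 2.2, -0.5, 0.1, -1.0, -4.0, -2.5, -3.5, -1.2, -2.8]
print(decode_topics(example_logits))
```

Raising the threshold trades recall for precision on each topic, which may be worth tuning separately for lower-scoring topics such as Leadership.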

## Training Procedure
- **Hyperparameters**  
  - learning_rate: 3e-05  
  - train_batch_size: 16  
  - eval_batch_size: 16  
  - seed: 42  
  - gradient_accumulation_steps: 2  
  - total_train_batch_size: 32  
  - optimizer: ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08  
  - lr_scheduler_type: linear  
  - num_epochs: 16  
  - mixed_precision_training: Native AMP

- **Training Configuration**  
  Training followed a standard procedure with evaluation at the end of each epoch. The released checkpoint is from epoch 7 (step 4718) and was selected based on overall performance metrics; the training log in this card covers epochs 1 to 7 of the configured 16.
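
The arithmetic behind these settings can be sketched in plain Python. The block below reproduces the effective batch size (16 × 2 = 32) and evaluates a linear learning-rate decay at a given optimizer step. It assumes a single device, no warmup (no warmup setting is listed in this card), and a schedule planned over all 16 epochs at the 674 steps per epoch shown in the Training Results table. The function names are illustrative and not part of any library.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 3e-5
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
NUM_EPOCHS = 16
STEPS_PER_EPOCH = 674  # optimizer steps per epoch, from the Training Results table


def effective_batch_size(per_device=TRAIN_BATCH_SIZE, accum=GRAD_ACCUM_STEPS, devices=1):
    """Samples contributing to one optimizer step (ASSUMES a single device)."""
    return per_device * accum * devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero at total_steps."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS  # 674 * 16 = 10784

print("effective batch size:", effective_batch_size())
print("planned optimizer steps:", total_steps)
print("learning rate at step 4718:", linear_lr(4718, total_steps))
```

Under these assumptions, step 4718 is 7/16 of the way through the schedule, so the learning rate at that point would be 9/16 of the initial value, about 1.69e-05.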

## Evaluation and Performance
The model achieves the following results on the evaluation set:

- Loss: 0.2655  
- Democracy F1: 0.9312  
- Democracy Accuracy: 0.9318  
- Economy F1: 0.9143  
- Economy Accuracy: 0.9151  
- Race F1: 0.9449  
- Race Accuracy: 0.9456  
- Leadership F1: 0.8488  
- Leadership Accuracy: 0.8494  
- Development F1: 0.8710  
- Development Accuracy: 0.8748  
- Corruption F1: 0.9420  
- Corruption Accuracy: 0.9441  
- Instability F1: 0.9164  
- Instability Accuracy: 0.9198  
- Safety F1: 0.9042  
- Safety Accuracy: 0.9032  
- Administration F1: 0.8831  
- Administration Accuracy: 0.8891  
- Education F1: 0.9565  
- Education Accuracy: 0.9567  
- Religion F1: 0.9426  
- Religion Accuracy: 0.9424  
- Environment F1: 0.9745  
- Environment Accuracy: 0.9746  
- Overall F1: 0.9191  
- Overall Accuracy: 0.9206  

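Per-topic F1 ranges from 0.8488 (Leadership) to 0.9745 (Environment), with an overall F1 of 0.9191. Leadership remains the lowest-scoring topic; the improvement reported for it is relative to the original model, whose scores are not listed in this card.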
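
The overall figures are numerically consistent with an unweighted (macro) mean of the 12 per-topic values. This is an observation from the reported numbers, not a confirmed description of the evaluation code. A minimal check, with the scores copied from the list above and a hypothetical `macro_average` helper:

```python
# (F1, accuracy) per topic, copied from the evaluation results in this card.
PER_TOPIC = {
    "Democracy": (0.9312, 0.9318),
    "Economy": (0.9143, 0.9151),
    "Race": (0.9449, 0.9456),
    "Leadership": (0.8488, 0.8494),
    "Development": (0.8710, 0.8748),
    "Corruption": (0.9420, 0.9441),
    "Instability": (0.9164, 0.9198),
    "Safety": (0.9042, 0.9032),
    "Administration": (0.8831, 0.8891),
    "Education": (0.9565, 0.9567),
    "Religion": (0.9426, 0.9424),
    "Environment": (0.9745, 0.9746),
}


def macro_average(values):
    """Unweighted mean: every topic counts equally, regardless of its frequency."""
    values = list(values)
    return sum(values) / len(values)


macro_f1 = macro_average(f1 for f1, _ in PER_TOPIC.values())
macro_acc = macro_average(acc for _, acc in PER_TOPIC.values())
weakest = min(PER_TOPIC, key=lambda topic: PER_TOPIC[topic][0])

print(f"macro F1: {macro_f1:.4f}  (reported overall F1: 0.9191)")
print(f"macro accuracy: {macro_acc:.4f}  (reported overall accuracy: 0.9206)")
print("lowest-F1 topic:", weakest)
```

Both means agree with the reported overall values to within rounding of the four-decimal inputs.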

### Training Results

| Training Loss | Epoch | Step | Validation Loss | Democracy F1 | Democracy Accuracy | Economy F1 | Economy Accuracy | Race F1 | Race Accuracy | Leadership F1 | Leadership Accuracy | Development F1 | Development Accuracy | Corruption F1 | Corruption Accuracy | Instability F1 | Instability Accuracy | Safety F1 | Safety Accuracy | Administration F1 | Administration Accuracy | Education F1 | Education Accuracy | Religion F1 | Religion Accuracy | Environment F1 | Environment Accuracy | Overall F1 | Overall Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:------------------:|:----------:|:----------------:|:-------:|:-------------:|:-------------:|:-------------------:|:--------------:|:--------------------:|:-------------:|:-------------------:|:--------------:|:--------------------:|:---------:|:---------------:|:-----------------:|:-----------------------:|:------------:|:------------------:|:-----------:|:-----------------:|:--------------:|:--------------------:|:----------:|:----------------:|
| 0.448         | 1.0   | 674  | 0.2781          | 0.8973       | 0.9201             | 0.8952     | 0.9062           | 0.9346  | 0.9385        | 0.8199        | 0.8340              | 0.8462         | 0.8672               | 0.9210        | 0.9302              | 0.8873         | 0.9084               | 0.8869    | 0.8947          | 0.8307            | 0.8700                  | 0.9344       | 0.9467             | 0.9219      | 0.9304            | 0.9565         | 0.9619               | 0.8943     | 0.9090           |
| 0.2646        | 2.0   | 1348 | 0.2372          | 0.9232       | 0.9335             | 0.9111     | 0.9144           | 0.9438  | 0.9467        | 0.8406        | 0.8403              | 0.8669         | 0.8739               | 0.9385        | 0.9424              | 0.9222         | 0.9278               | 0.9038    | 0.9081          | 0.8724            | 0.8869                  | 0.9543       | 0.9580             | 0.9380      | 0.9409            | 0.9732         | 0.9734               | 0.9157     | 0.9205           |
| 0.1696        | 3.0   | 2022 | 0.2291          | 0.9277       | 0.9333             | 0.9132     | 0.9177           | 0.9441  | 0.9469        | 0.8465        | 0.8503              | 0.8768         | 0.8847               | 0.9423        | 0.9454              | 0.9219         | 0.9255               | 0.9104    | 0.9114          | 0.8806            | 0.8919                  | 0.9592       | 0.9597             | 0.9407      | 0.9419            | 0.9753         | 0.9766               | 0.9199     | 0.9238           |
| 0.1309        | 4.0   | 2696 | 0.2374          | 0.9290       | 0.9344             | 0.9168     | 0.9175           | 0.9441  | 0.9452        | 0.8454        | 0.8470              | 0.8733         | 0.8804               | 0.9433        | 0.9465              | 0.9215         | 0.9233               | 0.9101    | 0.9096          | 0.8762            | 0.8758                  | 0.9577       | 0.9597             | 0.9389      | 0.9408            | 0.9740         | 0.9740               | 0.9192     | 0.9212           |
| 0.1085        | 5.0   | 3370 | 0.2414          | 0.9314       | 0.9346             | 0.9166     | 0.9175           | 0.9419  | 0.9452        | 0.8492        | 0.8459              | 0.8747         | 0.8808               | 0.9435        | 0.9463              | 0.9218         | 0.9257               | 0.9070    | 0.9083          | 0.8862            | 0.8921                  | 0.9574       | 0.9588             | 0.9420      | 0.9426            | 0.9732         | 0.9736               | 0.9204     | 0.9226           |
| 0.0759        | 6.0   | 4044 | 0.2556          | 0.9311       | 0.9313             | 0.9153     | 0.9162           | 0.9465  | 0.9473        | 0.8492        | 0.8511              | 0.8743         | 0.8810               | 0.9431        | 0.9447              | 0.9185         | 0.9205               | 0.9049    | 0.9034          | 0.8797            | 0.8886                  | 0.9588       | 0.9601             | 0.9419      | 0.9421            | 0.9753         | 0.9757               | 0.9199     | 0.9218           |
| 0.0618        | 7.0   | 4718 | 0.2655          | 0.9312       | 0.9318             | 0.9143     | 0.9151           | 0.9449  | 0.9456        | 0.8488        | 0.8494              | 0.8710         | 0.8748               | 0.9420        | 0.9441              | 0.9164         | 0.9198               | 0.9042    | 0.9032          | 0.8831            | 0.8891                  | 0.9565       | 0.9567             | 0.9426      | 0.9424            | 0.9745         | 0.9746               | 0.9191     | 0.9206           |
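
For readers comparing checkpoints, the log above can be scanned for the epoch that minimises validation loss or maximises an overall metric. The sketch below copies the relevant columns from the table. Which criterion was actually used to pick the released checkpoint is not stated in this card, so the two criteria shown are only examples.

```python
# (epoch, validation_loss, overall_f1, overall_accuracy), copied from the table above.
LOG = [
    (1, 0.2781, 0.8943, 0.9090),
    (2, 0.2372, 0.9157, 0.9205),
    (3, 0.2291, 0.9199, 0.9238),
    (4, 0.2374, 0.9192, 0.9212),
    (5, 0.2414, 0.9204, 0.9226),
    (6, 0.2556, 0.9199, 0.9218),
    (7, 0.2655, 0.9191, 0.9206),
]

# Lower is better for loss, higher is better for F1.
best_by_loss = min(LOG, key=lambda row: row[1])
best_by_f1 = max(LOG, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest overall F1: epoch {best_by_f1[0]} ({best_by_f1[2]})")
```

On these numbers the two criteria point to different epochs (3 and 5), and the overall F1 of epochs 3 through 7 differs by at most 0.0013, so the choice among later checkpoints has little effect on the headline metric.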

## Future Improvements
- Incorporate additional data and domain adaptation techniques to further improve performance across all topics.
- Enhance model interpretability using explainability methods.
- Monitor and update the model periodically to capture evolving political trends.

## License and Usage Notes
- The model is released under the Apache-2.0 license, as declared in the card metadata.
- The predictions of this model should be used as a reference and interpreted in light of the limitations of the training data.
- Users are encouraged to validate model outputs with human review for critical applications.
- Regular updates and retraining are recommended to maintain relevance and accuracy.

### Framework Versions
- Transformers: 4.48.2  
- PyTorch: 2.5.1+cu124  
- Datasets: 3.2.0  
- Tokenizers: 0.21.0