| </markdown> | |
# Malay Claim Classifier

This model is fine-tuned on a dataset of Malaysian claims and sorts claims written in Bahasa Malaysia into 9 topic categories to support fact-checking.

## Model Description

- **Model Type:** BERT-based sequence classification
- **Language:** Malay/Bahasa Malaysia
- **Base Model:** rmtariq/malay_classification
- **Number of Labels:** 9
- **Labels:** agama, alam sekitar, ekonomi, kesihatan, pendidikan, pengguna, politik, sosial, teknologi
- **Model Size:** 178M parameters
- **Tensor Type:** F32
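
The parameter count and tensor type listed above give a rough lower bound on memory use: 178M parameters stored as F32 take 4 bytes each. The calculation below is a back-of-the-envelope sketch, not a measured figure. It covers the weights only and ignores activations, the tokenizer, and framework overhead. The names `NUM_PARAMS`, `BYTES_PER_PARAM`, and `weight_memory_bytes` are illustrative and do not come from the model repository.

```python
# Rough weight-memory estimate from the figures in the model description.
NUM_PARAMS = 178_000_000   # "178M parameters" (rounded figure from the card)
BYTES_PER_PARAM = 4        # F32 = 32 bits = 4 bytes per parameter


def weight_memory_bytes(num_params: int, bytes_per_param: int) -> int:
    """Return the size of the raw weights in bytes (weights only)."""
    return num_params * bytes_per_param


total = weight_memory_bytes(NUM_PARAMS, BYTES_PER_PARAM)
print(f"{total / 1e6:.0f} MB ({total / 2**20:.0f} MiB)")  # prints "712 MB (679 MiB)"
```

Actual usage at inference time will be higher, since the estimate counts nothing beyond the stored weights.
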

## Category Descriptions

- **agama:** Religious claims, including halal/haram issues
- **alam sekitar:** Environmental claims, climate, weather, natural disasters
- **ekonomi:** Economic claims, business, finance, trade
- **kesihatan:** Health claims, diseases, treatments, mental health
- **pendidikan:** Education claims, schools, universities, exams
- **pengguna:** Consumer product claims, brands, quality, safety
- **politik:** Political claims, government, policies, elections
- **sosial:** Social claims, culture, entertainment, sports, crime
- **teknologi:** Technology claims, digital, internet, innovations
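
A sequence classifier of this kind emits one raw score (logit) per category, and a softmax turns those scores into probabilities. That helps when a fact-checking workflow wants a confidence value or the top few candidate categories rather than a single label. The sketch below uses only the standard library. The label order (alphabetical, as the labels are listed in this card) and the logit values are assumptions made for illustration; the authoritative index-to-label mapping is `model.config.id2label`.

```python
import math

# Assumed label order (alphabetical, as listed in the card). The real mapping
# lives in model.config.id2label and should be checked before relying on this.
LABELS = ["agama", "alam sekitar", "ekonomi", "kesihatan", "pendidikan",
          "pengguna", "politik", "sosial", "teknologi"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(LABELS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for illustration only, not real model output.
example_logits = [2.1, -0.3, 0.2, 1.4, -1.0, 3.0, -0.5, 0.1, -0.8]
for label, prob in top_k(example_logits):
    print(f"{label}: {prob:.3f}")
```

With real model output, the list passed to `top_k` would be the logits for one claim, for example `outputs.logits[0].tolist()`.
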

## Usage

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = BertTokenizer.from_pretrained("rmtariq/malay_claim_classifier_v2")
model = BertForSequenceClassification.from_pretrained("rmtariq/malay_claim_classifier_v2")

# Prepare input
# English: "Is it true that the red colouring used in snacks is made from insects and is not halal?"
example_claim = "Benarkah pewarna merah yang digunakan dalam makanan ringan dihasilkan daripada serangga dan tidak halal?"
inputs = tokenizer(example_claim, return_tensors="pt", padding=True, truncation=True, max_length=128)

# Get predictions (no gradients are needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits
    predicted_class = torch.argmax(predictions, dim=1).item()

# Map the predicted class index to its category name
label = model.config.id2label[predicted_class]
print(f"Claim: {example_claim}")
print(f"Predicted Category: {label}")