# Malay Claim Classifier
This model is fine-tuned on a dataset of Malaysian claims and classifies claims written in Bahasa Malaysia into 9 topical categories, so that each claim can be routed to the right area for fact-checking.
## Model Description
- **Model Type:** BERT-based sequence classification
- **Language:** Malay/Bahasa Malaysia
- **Base Model:** rmtariq/malay_classification
- **Number of Labels:** 9
- **Labels:** agama, alam sekitar, ekonomi, kesihatan, pendidikan, pengguna, politik, sosial, teknologi
- **Model Size:** 178M parameters
- **Tensor Type:** F32
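
As a rough guide to what the size and tensor type above imply, the following sketch estimates the memory needed for the weights alone. It assumes the parameter count is approximately 178 million as listed and that every parameter is stored as a 4-byte F32 value; activations, tokenizer files, and framework overhead are not included. The names `NUM_PARAMETERS` and `BYTES_PER_F32` are illustrative only.

```python
# Back-of-the-envelope estimate of weight storage, based on the figures
# listed in the model description (approximate, weights only).
NUM_PARAMETERS = 178_000_000   # "178M parameters" (rounded figure from the card)
BYTES_PER_F32 = 4              # a 32-bit float occupies 4 bytes

weight_bytes = NUM_PARAMETERS * BYTES_PER_F32
weight_megabytes = weight_bytes / 1_000_000  # decimal megabytes

print(f"Approximate weight size: {weight_megabytes:.0f} MB")
```

Actual memory use at inference time will be higher than this figure, since it also depends on batch size and sequence length.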
## Category Descriptions
- **agama:** Religious claims, including halal/haram issues
- **alam sekitar:** Environmental claims, climate, weather, natural disasters
- **ekonomi:** Economic claims, business, finance, trade
- **kesihatan:** Health claims, diseases, treatments, mental health
- **pendidikan:** Education claims, schools, universities, exams
- **pengguna:** Consumer product claims, brands, quality, safety
- **politik:** Political claims, government, policies, elections
- **sosial:** Social claims, culture, entertainment, sports, crime
- **teknologi:** Technology claims, digital, internet, innovations
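
The category list above can be kept alongside the model as a small lookup table, for example to show an English description next to a predicted label. This is a minimal sketch: `CATEGORY_DESCRIPTIONS` and `describe_label` are hypothetical helper names, not part of the model or the `transformers` library, and it assumes the label strings returned by the model match the lowercase names listed above.

```python
# Illustrative lookup table built from the category descriptions above.
# Assumption: predicted labels use exactly these lowercase names.
CATEGORY_DESCRIPTIONS = {
    "agama": "Religious claims, including halal/haram issues",
    "alam sekitar": "Environmental claims, climate, weather, natural disasters",
    "ekonomi": "Economic claims, business, finance, trade",
    "kesihatan": "Health claims, diseases, treatments, mental health",
    "pendidikan": "Education claims, schools, universities, exams",
    "pengguna": "Consumer product claims, brands, quality, safety",
    "politik": "Political claims, government, policies, elections",
    "sosial": "Social claims, culture, entertainment, sports, crime",
    "teknologi": "Technology claims, digital, internet, innovations",
}


def describe_label(label: str) -> str:
    """Return the English description for a category label (hypothetical helper)."""
    key = label.strip().lower()  # tolerate stray whitespace or capitalisation
    if key not in CATEGORY_DESCRIPTIONS:
        raise KeyError(f"Unknown label: {label!r}")
    return CATEGORY_DESCRIPTIONS[key]


print(describe_label("agama"))
```

Raising on an unknown label, rather than returning a default, makes a mismatch between this table and the model's label set visible immediately.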
## Usage
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch
# Load model and tokenizer
tokenizer = BertTokenizer.from_pretrained("rmtariq/malay_claim_classifier_v2")
model = BertForSequenceClassification.from_pretrained("rmtariq/malay_claim_classifier_v2")
# Prepare input
example_claim = "Benarkah pewarna merah yang digunakan dalam makanan ringan dihasilkan daripada serangga dan tidak halal?"
inputs = tokenizer(example_claim, return_tensors="pt", padding=True, truncation=True, max_length=128)
# Get predictions
with torch.no_grad():
    # Inference only: no gradients are needed
    outputs = model(**inputs)
predictions = outputs.logits
predicted_class = torch.argmax(predictions, dim=1).item()
label = model.config.id2label[predicted_class]
print(f"Claim: {example_claim}")
print(f"Predicted Category: {label}")