Climate-Sentiment-AraBERT: AraBERT for Climate Change Sentiment Analysis
Overview
Climate-Sentiment-AraBERT is a specialized Arabic pretrained language model (PLM) for analyzing social media content. It classifies climate-related posts into three sentiment classes: positive, negative, or neutral. It works on Modern Standard Arabic (MSA) and on multiple dialects (Egyptian, Gulf, Levantine, Maghrebi, Mesopotamian, etc.).
The model can be used directly for inference or as a starting point for further fine-tuning.
Model Details:
- Base Model: aubmindlab/bert-base-arabertv02-twitter
- Language: Arabic
- License: Apache License 2.0
Model Inference
You can use Climate-Sentiment-AraBERT directly on any climate-related dataset to classify the sentiment of each post. Follow these steps:
1. Install the required libraries. Make sure the following libraries are installed with pip before using the model:
pip install arabert transformers torch
2. Load the Model and Tokenizer
# Import required Modules
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
# Load model and Tokenizer
model_name = 'hugsanaa/Climate-Sentiment-AraBERT'
model = AutoModelForSequenceClassification.from_pretrained(model_name, return_dict=False, num_labels=3)
tokenizer = AutoTokenizer.from_pretrained(model_name)
3. Predict
# Example of neutral sentiment
text = "سيول_جدة امطار جده كسرت رقم عام 2009م، وتصنف من الظواهر الطبيعية الشديدة"
# Tokenize input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
# Make Predictions
with torch.no_grad():
    logits = model(**inputs)[0]  # first element of the tuple
    predicted_class = torch.argmax(logits, dim=1).item()
# Interpret results
labels = ["Negative", "Neutral", "Positive"]
print(f"Prediction: {labels[predicted_class]}")
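The predicted class alone does not say how confident the model is. As a minimal sketch, the logits can be turned into probabilities with a softmax. The helper names `softmax` and `describe_prediction` and the example logits are hypothetical and used only for illustration; the label order is assumed to match the list in the snippet above. With the tensor from step 3, the input would be `logits[0].tolist()`.

```python
import math

# Assumed label order, copied from the prediction snippet above.
LABELS = ["Negative", "Neutral", "Positive"]

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def describe_prediction(logits, labels=LABELS):
    """Return the most likely label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for a single post (not real model output).
example_logits = [0.2, 1.5, -0.7]
label, confidence = describe_prediction(example_logits)
print(f"Prediction: {label} ({confidence:.2%})")
```

A low top probability can be a hint that a post is ambiguous and worth manual review.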
Inference using pipeline
# This example additionally needs: pip install pandas more-itertools tqdm
import pandas as pd
from transformers import pipeline
import more_itertools
from tqdm.auto import tqdm

model = 'hugsanaa/Climate-Sentiment-AraBERT'
# Load the dataset (it must include a 'text' column).
# your_climate_sentiment_data is a placeholder for the path to your CSV file.
data = pd.read_csv(your_climate_sentiment_data)
# Define maximum sequence length
max_len = 128
# Build the prediction pipeline (device=0 selects the first GPU; use device=-1 for CPU)
pipe = pipeline("sentiment-analysis", model=model, device=0, return_all_scores=True, max_length=max_len, truncation=True)
preds = []
for s in tqdm(more_itertools.chunked(list(data['text']), 32)):  # batching for faster inference
    preds.extend(pipe(s))
# Generate final predictions: keep the label with the highest score for each post
data['preds'] = preds
final_pred = []
for prediction in data['preds']:
    final_pred.append(max(prediction, key=lambda x: x['score'])['label'])
data['Final Prediction'] = final_pred
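The final step above reduces each post's list of label scores to a single label. The sketch below isolates that step on hand-written data so it can be run without downloading the model. The scores and the helper `top_label` are hypothetical, and the label strings are an assumption: the actual names returned by the pipeline depend on the model's configuration.

```python
from collections import Counter

def top_label(scores):
    """Pick the highest-scoring label from one post's list of {label, score} dicts."""
    return max(scores, key=lambda entry: entry["score"])["label"]

# Hypothetical pipeline output for two posts (one list of scores per post).
mock_preds = [
    [
        {"label": "Negative", "score": 0.10},
        {"label": "Neutral", "score": 0.75},
        {"label": "Positive", "score": 0.15},
    ],
    [
        {"label": "Negative", "score": 0.62},
        {"label": "Neutral", "score": 0.30},
        {"label": "Positive", "score": 0.08},
    ],
]

final_labels = [top_label(scores) for scores in mock_preds]
label_counts = Counter(final_labels)  # quick overview of the sentiment distribution
print(final_labels)
print(label_counts)
```

Counting the final labels gives a quick check that the predictions are not collapsing onto a single class.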
Results
The table below reports the results of evaluating Climate-Sentiment-AraBERT on a held-out test set of climate-related posts:
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Negative | 0.7721 | 0.7740 | 0.7730 | 1221 |
| Neutral | 0.7267 | 0.6966 | 0.7114 | 1546 |
| Positive | 0.7489 | 0.8041 | 0.7755 | 827 |
| Macro Avg. | 0.7492 | 0.7582 | 0.7533 | 3594 |
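The last row of the table appears to be an unweighted (macro) average of the three per-class rows, with the support column summed. The sketch below recomputes it from the values in the table; the variable names are illustrative only.

```python
# Per-class metrics copied from the results table: (precision, recall, f1, support).
per_class = {
    "Negative": (0.7721, 0.7740, 0.7730, 1221),
    "Neutral": (0.7267, 0.6966, 0.7114, 1546),
    "Positive": (0.7489, 0.8041, 0.7755, 827),
}

def macro_average(rows, index):
    """Unweighted mean of one metric column across all classes."""
    values = [row[index] for row in rows.values()]
    return round(sum(values) / len(values), 4)

macro_precision = macro_average(per_class, 0)
macro_recall = macro_average(per_class, 1)
macro_f1 = macro_average(per_class, 2)
total_support = sum(row[3] for row in per_class.values())

print(macro_precision, macro_recall, macro_f1, total_support)
```

Because the macro average ignores class size, the smaller Positive class counts as much as the larger Neutral class in these figures.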