How to use marcuskd/norbert2_sentiment_test1 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="marcuskd/norbert2_sentiment_test1")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("marcuskd/norbert2_sentiment_test1")
model = AutoModelForSequenceClassification.from_pretrained("marcuskd/norbert2_sentiment_test1")
Sentiment analysis for Norwegian reviews.
This model was trained on a custom dataset built by concatenating the Norwegian Review Corpus (https://github.com/ltgoslo/norec) with a Norwegian sentiment dataset from Hugging Face (https://huggingface.co/datasets/sepidmnorozy/Norwegian_sentiment). It is intended for testing only.
Enter Norwegian sentences to check their sentiment (negative to positive).
Dataset: https://huggingface.co/datasets/marcuskd/reviews_binary_not4_concat
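As a rough illustration of plugging in sentences, the sketch below wraps the pipeline call in two small helpers. The helper names (`load_classifier`, `classify`) are hypothetical and not part of this model card. The label strings the model returns are not documented here, so the code passes them through unchanged rather than assuming which one means positive.

```python
from typing import List, Tuple

MODEL_ID = "marcuskd/norbert2_sentiment_test1"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so that classify() below has no hard dependency on transformers.
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)


def classify(sentences: List[str], classifier) -> List[Tuple[str, str, float]]:
    """Return (sentence, label, score) for each input sentence.

    `classifier` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts, which is what the pipeline returns.
    """
    results = classifier(sentences)
    return [(s, r["label"], float(r["score"])) for s, r in zip(sentences, results)]
```

For example, `classify(["Denne filmen var helt fantastisk!", "Maten var kald og servicen var dårlig."], load_classifier())` should yield one label and score per sentence; the two sentences are made-up examples, not taken from the dataset.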
Tokenized using:
tokenizer = AutoTokenizer.from_pretrained("ltgoslo/norbert2")
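The card does not show how the tokenizer was applied to the dataset. A minimal sketch of one common approach follows; the function name `tokenize_batch`, the column name `"text"`, and the `max_length` of 512 are all assumptions, not values taken from this model card.

```python
from typing import Any, Dict, List


def tokenize_batch(
    batch: Dict[str, List[str]],
    tokenizer: Any,
    text_column: str = "text",   # assumed column name
    max_length: int = 512,       # assumed maximum sequence length
) -> Dict[str, Any]:
    """Tokenize one batch of examples, truncating long reviews."""
    return tokenizer(batch[text_column], truncation=True, max_length=max_length)
```

With the `datasets` library, a function like this is typically applied with `dataset.map(lambda b: tokenize_batch(b, tokenizer), batched=True)`, where `tokenizer` is the NorBERT2 tokenizer loaded above.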
Training arguments for this model:
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=10,             # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,                # log every 10 steps
)
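To make the effect of `warmup_steps=500` concrete, here is a small sketch of a linear warmup followed by linear decay. The arguments above set neither a learning rate nor a scheduler type, so this assumes the Transformers defaults (a peak learning rate of 5e-5 and the linear schedule). The total number of training steps depends on the dataset size and is not stated, so `total_steps=10_000` is a purely hypothetical value.

```python
def linear_warmup_lr(
    step: int,
    total_steps: int,
    base_lr: float = 5e-5,    # assumed: library default, not stated in the card
    warmup_steps: int = 500,  # from the training arguments above
) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length, for illustration only.
for s in (0, 250, 500, 5_000, 10_000):
    print(s, linear_warmup_lr(s, total_steps=10_000))
```

The learning rate reaches its peak at step 500 and falls off afterwards, so the first few hundred updates are deliberately small.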
Evaluation results on the test split of the dataset:
{
'accuracy': 0.8357214261912695,
'recall': 0.886873508353222,
'precision': 0.8789025543992431,
'f1': 0.8828700403896412,
'total_time_in_seconds': 94.33071640000003,
'samples_per_second': 31.81360340013276,
'latency_in_seconds': 0.03143309443518828
}
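The reported numbers can be cross-checked against each other. The sketch below recomputes F1 from precision and recall, and derives the test-set size implied by the timing fields. It assumes that `samples_per_second` is the sample count divided by total time and that `latency_in_seconds` is total time divided by the sample count; the card does not state these definitions.

```python
# Metrics copied from the evaluation results above.
metrics = {
    "accuracy": 0.8357214261912695,
    "recall": 0.886873508353222,
    "precision": 0.8789025543992431,
    "f1": 0.8828700403896412,
    "total_time_in_seconds": 94.33071640000003,
    "samples_per_second": 31.81360340013276,
    "latency_in_seconds": 0.03143309443518828,
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_from(metrics["precision"], metrics["recall"])

# Assumed definitions: throughput = n / time, latency = time / n.
n_samples = round(metrics["samples_per_second"] * metrics["total_time_in_seconds"])
latency = metrics["total_time_in_seconds"] / n_samples

print(f"F1 recomputed:      {f1:.6f}")
print(f"Implied test size:  {n_samples}")
print(f"Latency recomputed: {latency:.6f} s")
```

Under those assumptions the figures agree with one another, and the timing fields imply a test split of roughly 3,000 samples.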