| --- |
| language: |
| - 'no' |
| - nb |
| - nn |
| license: cc-by-4.0 |
| datasets: |
| - ltg/norec_sentence |
| pipeline_tag: text-classification |
| --- |
| |
| # Sentence-level Sentiment Analysis model for Norwegian text |
| This model is a fine-tuned version of [ltg/norbert3-base](https://huggingface.co/ltg/norbert3-base) for sentence-level sentiment classification of Norwegian text. |
|
|
| ## Training data |
| The dataset used for fine-tuning is the `mixed` subset of [ltg/norec_sentence](https://huggingface.co/datasets/ltg/norec_sentence), which has four sentiment categories: |
| ``` |
| [0]: Negative, |
| [1]: Positive, |
| [2]: Neutral, |
| [0,1]: Mixed |
| ``` |
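When working with the raw dataset labels, a small lookup can translate them into the category names listed above. This is a minimal sketch: `LABEL_NAMES` and `label_name` are hypothetical helpers, not part of the dataset or of `transformers`, and the code assumes each label arrives as a list of integers in the encoding shown above.

```python
# Label encoding for the `mixed` subset, copied from the list above.
LABEL_NAMES = {
    (0,): "Negative",
    (1,): "Positive",
    (2,): "Neutral",
    (0, 1): "Mixed",
}


def label_name(label):
    """Map a raw label list such as [0, 1] to its category name."""
    # Sorting is an assumption: it treats [1, 0] the same as [0, 1].
    return LABEL_NAMES[tuple(sorted(label))]


print(label_name([0, 1]))
print(label_name([2]))
```

Tuples are used as dictionary keys because Python lists are not hashable; an unknown label combination raises a `KeyError` rather than being silently mapped.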
|
|
| ## Quick start |
| You can use this model for inference as follows: |
| ``` |
| >>> from transformers import AutoTokenizer, pipeline |
| >>> origin = "ltg/norbert3-large_sentence-sentiment" |
| >>> pipe = pipeline("text-classification", |
| ...     model=origin, |
| ...     trust_remote_code=origin.startswith("ltg/norbert3"),  # NorBERT3 uses custom model code |
| ...     config=origin, |
| ...     tokenizer=AutoTokenizer.from_pretrained(origin), |
| ... ) |
| >>> preds = pipe(["Hans hese, litt såre stemme kler bluesen, men denne platen kommer neppe til å bli blant hans største kommersielle suksesser.", |
| ... "Borten-regjeringen gjorde ikke jobben sin." ]) |
| >>> for p in preds: |
| ...     print(p) |
| ``` |
| Output (the first line is a warning emitted by `transformers`; the predictions are still returned): |
| ``` |
| The model 'NorbertForSequenceClassification' is not supported for text-classification. Supported models are ['AlbertForSequenceClassification', ... |
| {'label': 'Mixed', 'score': 0.7435498237609863} |
| {'label': 'Negative', 'score': 0.765734851360321} |
| ``` |
|
|
| ## Training hyperparameters |
| - per_device_train_batch_size: 32 |
| - learning_rate: 1e-05 |
| - gradient_accumulation_steps: 1 |
| - num_train_epochs: 10 (best epoch 2) |
| |
| ## Evaluation |
| | Category         | F1       | |
| |:-----------------|---------:| |
| | Negative         | 0.670241 | |
| | Positive         | 0.832918 | |
| | Neutral          | 0.850082 | |
| | Mixed            | 0.580645 | |
| | Weighted average | 0.799663 | |
| |
| |