anpmts committed on
Commit c27cd1b · verified · 1 Parent(s): ec49d62

Upload sentiment classifier model

Files changed (2)
  1. config.yaml +179 -0
  2. final_model/README.md +114 -0
config.yaml ADDED
@@ -0,0 +1,179 @@
+ model:
+ name: sentiment_classifier
+ type: classification
+ model:
+ pretrained_model: xlm-roberta-base
+ num_labels: 3
+ dropout: 0.1
+ hidden_size: 768
+ labels:
+ - negative
+ - neutral
+ - positive
+ class_weights: null
+ tokenizer:
+ max_length: 256
+ padding: max_length
+ truncation: true
+ add_special_tokens: true
+ huggingface_hub:
+ enabled: true
+ repo_id: anpmts/sentiment-classifier
+ private: false
+ create_model_card: true
+ commit_message: Upload sentiment classifier model
+ model_card:
+ language: multilingual
+ license: apache-2.0
+ tags:
+ - sentiment-analysis
+ - text-classification
+ - xlm-roberta
+ - sequence-classification
+ datasets: null
+ training:
+ epochs: 10
+ batch_size: 128
+ gradient_accumulation_steps: 1
+ max_grad_norm: 1.0
+ distributed:
+ enabled: true
+ backend: nccl
+ find_unused_parameters: true
+ precision:
+ mode: bf16
+ performance:
+ torch_compile: false
+ compile_mode: reduce-overhead
+ cudnn_benchmark: true
+ gradient_checkpointing: false
+ tf32: true
+ flash_attention_2: false
+ matmul_precision: high
+ channels_last: false
+ optimizer:
+ type: adamw
+ lr: 2.0e-05
+ weight_decay: 0.01
+ eps: 1.0e-08
+ betas:
+ - 0.9
+ - 0.999
+ fused: false
+ scheduler:
+ type: cosine
+ warmup_ratio: 0.1
+ warmup_steps: null
+ num_cycles: 0.5
+ early_stopping:
+ enabled: true
+ patience: 3
+ min_delta: 0.001
+ monitor: val_loss
+ mode: min
+ checkpoint:
+ save_top_k: 2
+ monitor: val_loss
+ mode: min
+ save_last: true
+ every_n_epochs: 1
+ resume_from_checkpoint: true
+ pretrained_checkpoint: null
+ load_only_model: true
+ eval:
+ eval_every_n_steps: null
+ eval_accumulation_steps: 1
+ dataloader:
+ num_workers: 0
+ pin_memory: true
+ persistent_workers: false
+ prefetch_factor: null
+ deterministic: false
+ benchmark: true
+ data:
+ data_source: local
+ chunked:
+ enabled: false
+ train_path: data/amazon_reviews/train
+ val_path: data/amazon_reviews/validation
+ test_path: data/amazon_reviews/test
+ chunk_size: 100000
+ total_train_samples: 3600000
+ text_field: text
+ label_field: sentiment_label
+ huggingface:
+ repo: anpmts/trustshop
+ split_mapping:
+ train: train
+ val: validation
+ test: test
+ field_mapping:
+ text: text
+ sentiment_label: sentiment_label
+ sentiment_score: sentiment_score
+ quality_label: quality
+ config_name: null
+ revision: null
+ max_samples: null
+ local:
+ data_dir: data/amazon_reviews
+ processed_dir: data/processed/amazon_reviews
+ split:
+ train: 0.7
+ val: 0.15
+ test: 0.15
+ stratify: true
+ filter_quality:
+ enabled: false
+ keep_labels:
+ - valid
+ class_balancing:
+ enabled: false
+ strategy: oversample
+ oversample:
+ sampling_strategy: auto
+ smote:
+ k_neighbors: 5
+ sampling_strategy: auto
+ augmentation:
+ enabled: false
+ techniques:
+ - synonym_replacement
+ - random_deletion
+ - random_swap
+ augment_ratio: 0.1
+ preprocessing:
+ lowercase: false
+ remove_urls: true
+ remove_email: true
+ remove_special_chars: false
+ min_text_length: 10
+ cache:
+ enabled: true
+ cache_dir: data/.cache/amazon_reviews
+ seed: 42
+ validation:
+ check_missing_fields: false
+ check_empty_text: true
+ log_invalid_samples: true
+ project:
+ name: ts-train
+ seed: 42
+ device: cuda
+ mixed_precision: true
+ paths:
+ data_dir: data
+ data_file: data/output.jsonl
+ output_dir: outputs
+ model_dir: models
+ log_dir: logs
+ logging:
+ use_wandb: true
+ wandb_project: ts-absa-classification
+ wandb_entity: null
+ use_tensorboard: true
+ log_interval: 10
+ experiment:
+ name: null
+ tags: []
+ notes: ''
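
The scheduler settings in this config (`type: cosine`, `warmup_ratio: 0.1`, `num_cycles: 0.5`, with `lr: 2.0e-05`) can be read as a linear warmup followed by a half-cosine decay. The sketch below follows the formula used by `get_cosine_schedule_with_warmup` in `transformers`; whether the training code calls that function is an assumption, and both `lr_at_step` and the total of 1000 steps are hypothetical, chosen only for illustration.

```python
import math


def lr_at_step(step, total_steps, base_lr=2.0e-05, warmup_ratio=0.1, num_cycles=0.5):
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1]
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # num_cycles=0.5 gives one half cosine wave: base_lr down to 0
    scale = 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress))
    return base_lr * max(0.0, scale)


if __name__ == "__main__":
    total = 1000  # hypothetical; the real step count depends on dataset size and batch size
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at_step(s, total))
```

With these values the rate peaks at `2.0e-05` when warmup ends (step 100 of 1000) and reaches zero at the final step.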
final_model/README.md ADDED
@@ -0,0 +1,114 @@
+ ---
+ language: multilingual
+ license: apache-2.0
+ tags:
+ - sentiment-analysis
+ - text-classification
+ - xlm-roberta
+ - dual-head
+ ---
+
+ # Sentiment Classifier
+
+ ## Model Description
+
+ This is a dual-head sentiment classifier built on top of XLM-RoBERTa. The model performs two tasks simultaneously:
+
+ 1. **Sentiment Classification:** Predicts sentiment labels (positive, neutral, negative)
+ 2. **Sentiment Score Regression:** Predicts a continuous sentiment score in the range [0, 1]
+
+ The model uses a weighted loss function combining cross-entropy (70%) for classification and MSE (30%) for regression,
+ allowing it to capture both discrete sentiment categories and fine-grained sentiment intensity.
+
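
As a rough illustration of the weighted loss described above, the sketch below combines cross-entropy and squared error for a single sample with the stated 70/30 weights. The function names are hypothetical, and the actual training code (including how it averages over a batch) is not shown in this commit.

```python
import math


def cross_entropy(logits, target_index):
    """Cross-entropy of one sample, computed from raw logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def combined_loss(logits, target_index, score_pred, score_target,
                  ce_weight=0.7, mse_weight=0.3):
    """Weighted sum of classification and regression losses for one sample."""
    ce = cross_entropy(logits, target_index)
    mse = (score_pred - score_target) ** 2
    return ce_weight * ce + mse_weight * mse


if __name__ == "__main__":
    # Uniform logits over 3 classes give CE = ln(3); the score error is 0.4
    print(combined_loss([0.0, 0.0, 0.0], 1, score_pred=0.5, score_target=0.9))
```
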
+ ## Model Architecture
+
+ - **Base Model:** xlm-roberta-base
+ - **Task:** text-classification
+ - **Number of Labels:** 3
+ - **Labels:** negative, neutral, positive
+
+ ## Training Configuration
+
+ - **Epochs:** 10
+ - **Batch Size:** 128
+ - **Learning Rate:** 2e-05
+ - **Warmup Ratio:** 0.1
+ - **Weight Decay:** 0.01
+ - **Max Seq Length:** 256
+ - **Mixed Precision:** FP16=False, BF16=True
+
+ ## Performance Metrics
+
+ - **Loss:** 0.6947
+ - **Accuracy:** 0.4901
+ - **Precision:** 0.2402
+ - **Recall:** 0.4901
+ - **F1:** 0.3224
+ - **F1 Macro:** 0.3289
+ - **F1 Negative:** 0.0000
+ - **Precision Negative:** 0.0000
+ - **Recall Negative:** 0.0000
+ - **Support Negative:** 900
+ - **F1 Neutral:** 0.6578
+ - **Precision Neutral:** 0.4901
+ - **Recall Neutral:** 1.0000
+ - **Support Neutral:** 865
+ - **Runtime (s):** 0.7012
+ - **Samples Per Second:** 2517.1350
+ - **Steps Per Second:** 9.9830
+
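
One way to read the figures above: they are numerically consistent with a classifier that predicts `neutral` for every evaluation sample. The sketch below recomputes the metrics under that hypothetical scenario, using only the two reported supports (900 negative, 865 neutral) and assuming support-weighted averaging for the overall precision, recall, and F1. It is an arithmetic check, not a statement about how the evaluation was actually run.

```python
# Reported per-class supports from the model card
support = {"negative": 900, "neutral": 865}
total = sum(support.values())

# Hypothetical scenario: every sample is predicted "neutral"
precision = {"negative": 0.0, "neutral": support["neutral"] / total}
recall = {"negative": 0.0, "neutral": 1.0}


def f1(p, r):
    """Harmonic mean of precision and recall, 0 when both are 0."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


f1_per_class = {c: f1(precision[c], recall[c]) for c in support}
accuracy = support["neutral"] / total
macro_f1 = sum(f1_per_class.values()) / len(f1_per_class)
weighted_precision = sum(precision[c] * support[c] for c in support) / total
weighted_f1 = sum(f1_per_class[c] * support[c] for c in support) / total

print(round(accuracy, 4))                 # reported Accuracy: 0.4901
print(round(weighted_precision, 4))       # reported Precision: 0.2402
print(round(weighted_f1, 4))              # reported F1: 0.3224
print(round(macro_f1, 4))                 # reported F1 Macro: 0.3289
print(round(f1_per_class["neutral"], 4))  # reported F1 Neutral: 0.6578
```
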
+ ## Model Outputs
+
+ The model returns two outputs:
+
+ - **Logits:** Classification logits for sentiment labels [batch_size, 3]
+ - **Score Predictions:** Continuous sentiment scores [batch_size]
+
+ Both outputs are computed from the same shared representation (CLS token) of the input text.
+
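
To turn one row of classification logits into a label, a softmax followed by an argmax is the usual approach. The sketch below does this in plain Python; `decode` is a hypothetical helper, and the index-to-label mapping assumes the order in which the labels are listed in the config (negative, neutral, positive), which this commit does not confirm.

```python
import math

# Assumed mapping: index 0 = negative, 1 = neutral, 2 = positive
LABELS = ["negative", "neutral", "positive"]


def decode(logits_row):
    """Return (label, probability) for one row of 3 classification logits."""
    m = max(logits_row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits_row]
    z = sum(exps)
    probs = [e / z for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


if __name__ == "__main__":
    print(decode([0.1, 2.0, -1.0]))
```
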
+ ## Intended Use
+
+ This model is intended for sentiment analysis tasks on multilingual text, particularly in scenarios where both
+ categorical sentiment (positive/neutral/negative) and sentiment intensity are important.
+
+ **Typical use cases:**
+ - Product review analysis
+ - Social media sentiment monitoring
+ - Customer feedback classification
+
+ ## Usage
+
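+ ```python
+ import torch
+ from transformers import AutoTokenizer
+
+ # SentimentClassifier is a custom class from this project's source tree, not part of transformers
+ from src.models.sentiment_classifier import SentimentClassifier
+
+ # Load model and tokenizer (repo_id taken from config.yaml)
+ model = SentimentClassifier.from_pretrained("anpmts/sentiment-classifier")
+ tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
+ model.eval()  # disable dropout for inference; assumes the class is a torch.nn.Module
+
+ # Prepare input (max_length matches the training configuration)
+ text = "Your input text here"
+ inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)
+
+ # Make prediction without tracking gradients
+ with torch.no_grad():
+     outputs = model(**inputs)
+ predictions = outputs["logits"].argmax(dim=-1)
+ ```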
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @misc{sentiment_classifier,
+ title={Sentiment Classifier},
+ author={{Your Name}},
+ year={2025},
+ publisher={Hugging Face},
+ howpublished={\url{https://huggingface.co/anpmts/sentiment-classifier}}
+ }
+ ```
+
+ ---
+
+ *This model card was automatically generated with [Claude Code](https://claude.com/claude-code)*