tkbarb10 committed on
Commit 9b1ba8d · verified · 1 Parent(s): faac1a0

Update README.md

Files changed (1):
  1. README.md +132 -9
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  library_name: transformers
- license: apache-2.0
  base_model: bert-base-uncased
  tags:
  - generated_from_trainer
@@ -9,34 +9,155 @@ metrics:
  model-index:
  - name: experiment_labels_bert_base
    results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # experiment_labels_bert_base

- This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.6531
  - Accuracy: 0.7444
- - F1 Macro: 0.7295
  - F1 Weighted: 0.7451

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
@@ -44,7 +165,7 @@ The following hyperparameters were used during training:
  - train_batch_size: 32
  - eval_batch_size: 64
  - seed: 42
- - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 300
  - num_epochs: 2
@@ -52,6 +173,8 @@ The following hyperparameters were used during training:

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Weighted |
  |:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------:|
  | 0.6645        | 1.0   | 1540 | 0.6703          | 0.7275   | 0.7134   | 0.7292      |
@@ -63,4 +186,4 @@ The following hyperparameters were used during training:
  - Transformers 5.0.0
  - Pytorch 2.10.0+cu128
  - Datasets 4.0.0
- - Tokenizers 0.22.2
  ---
  library_name: transformers
+ license: mit
  base_model: bert-base-uncased
  tags:
  - generated_from_trainer
  model-index:
  - name: experiment_labels_bert_base
    results: []
+ datasets:
+ - ADS509/full_experiment_labels
+ language:
+ - en
+ pipeline_tag: text-classification
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # Experiment_labels_bert_base
+
+ This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on a dataset of social media comments
+ from 5 separate sources.

  It achieves the following results on the evaluation set:
  - Loss: 0.6531
  - Accuracy: 0.7444
+ - **F1 Macro: 0.7295**
  - F1 Weighted: 0.7451

  ## Model description

+ We retrained the classification layer of BERT Base for a multi-class classification task on our self-labeled data. The model description
+ of the base model can be found at the link above, and the description of the dataset can be found [here](https://huggingface.co/datasets/ADS509/full_experiment_labels). The
+ fine-tuning parameters are listed below. This model was the initial model used in our experiment to see whether there was any promise in our self-labeling approach.
 
  ## Intended uses & limitations

+ The intended use of this model is to better understand the nature of different social media websites and the discourse on each site,
+ beyond the usual "positive", "negative", "neutral" sentiment of most models. The labels for the commentary data are as follows:
+
+ - Argumentative
+ - Opinion
+ - Informational
+ - Expressive
+ - Neutral
+
+ We think there is promise in this approach, and as this is the initial step towards a deeper understanding of social commentary, there are
+ several limitations to outline:
+
+ - As there were a total of 70k records, the data was primarily labeled by language models, with the prompt including correctly labeled examples and
+ incorrectly labeled examples alongside the correct label. Three language models were tasked with labeling, and only the majority-vote labels were
+ kept. Three-way-tie samples were set aside. Future iterations would benefit from more models labeling and more human-labeled examples.
+ - When reviewing records that were ambiguous or that the classifier predicted incorrectly, it was clear that the labeling scheme is fuzzy in some instances.
+ For instance, many "Opinion" comments can also be viewed as "Expressive" or "Argumentative", leading to ambiguous labeling from the models. It would be worth
+ exploring a more nuanced labeling scheme, perhaps splitting "Expressive" into 2-3 labels and "Opinion" into another 1 or 2.
+ - Due to the nature of the project, the commentary data used for training was subject to the following limitations:
+   - Queries were isolated to "politics" or "US politics"
+   - With one exception, all comment data is dated from Jan 1, 2026 to Feb 12, 2026
+   - We set a ceiling and a floor for the number of comments per post: no posts with under 10 comments were used, and for posts with many comments
+ we only pulled the most recent 300
+
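The majority-vote step described above can be sketched as follows. This is a minimal illustration, not the project's actual labeling pipeline; the example comments and vote tuples are made up:

```python
from collections import Counter

def majority_vote(votes):
    """Return the majority label from three model votes, or None on a three-way tie."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= 2 else None  # three-way tie -> set aside

# Hypothetical votes from three language models for two comments
rows = [
    ("Opinion", "Opinion", "Expressive"),        # 2-1 majority -> kept
    ("Opinion", "Expressive", "Argumentative"),  # three-way tie -> set aside
]
labels = [majority_vote(r) for r in rows]
```

Records where `majority_vote` returns `None` would be the set-aside three-way ties.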
 
  ## Training and evaluation data

+ A full description of the data can be found [here](https://huggingface.co/datasets/ADS509/full_experiment_labels).
 
  ## Training procedure

+ The full code used for training is below:
+
+ ```python
+ import numpy as np
+ import torch
+ from datasets import load_dataset
+ from sklearn.metrics import accuracy_score, f1_score
+ from transformers import (
+     AutoModelForSequenceClassification,
+     AutoTokenizer,
+     DataCollatorWithPadding,
+     Trainer,
+     TrainingArguments,
+ )
+
+ MODEL_ID = "bert-base-uncased"
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+
+ # Dataset with train/test/valid splits (see the dataset card linked above)
+ dataset = load_dataset("ADS509/full_experiment_labels")
+
+ # Function to tokenize data with
+ def tokenize_function(batch):
+     return tokenizer(
+         batch['text'],
+         truncation=True,
+         max_length=512  # Can't be greater than model max length
+     )
+
+ # Tokenize data
+ train_data = dataset['train'].map(tokenize_function, batched=True)
+ test_data = dataset['test'].map(tokenize_function, batched=True)
+ valid_data = dataset['valid'].map(tokenize_function, batched=True)
+
+ # Convert lists to tensors
+ train_data.set_format("torch", columns=['input_ids', 'attention_mask', 'label'])
+ test_data.set_format("torch", columns=['input_ids', 'attention_mask', 'label'])
+ valid_data.set_format("torch", columns=['input_ids', 'attention_mask', 'label'])
+
+ # Label mappings (the id assignment shown here is illustrative)
+ label2id = {'Argumentative': 0, 'Opinion': 1, 'Informational': 2, 'Expressive': 3, 'Neutral': 4}
+ id2label = {v: k for k, v in label2id.items()}
+
+ model = AutoModelForSequenceClassification.from_pretrained(
+     MODEL_ID,
+     num_labels=5,  # adjust this based on the number of labels you're training on
+     device_map='cuda',
+     dtype='auto',
+     label2id=label2id,
+     id2label=id2label
+ )
+
+ # Metric function for evaluation in Trainer
+ def compute_metrics(eval_pred):
+     predictions, labels = eval_pred
+     predictions = np.argmax(predictions, axis=1)
+
+     return {
+         'accuracy': accuracy_score(labels, predictions),
+         'f1_macro': f1_score(labels, predictions, average='macro'),
+         'f1_weighted': f1_score(labels, predictions, average='weighted')
+     }
+
+ # Data collator to handle padding dynamically per batch
+ data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
+
+ training_args = TrainingArguments(
+     output_dir='./bert-comment',
+     num_train_epochs=2,
+     per_device_train_batch_size=32,
+     per_device_eval_batch_size=64,
+     learning_rate=2e-5,
+     weight_decay=0.01,
+     warmup_steps=200,
+
+     # Evaluation & saving
+     eval_strategy='epoch',
+     save_strategy='epoch',
+     load_best_model_at_end=True,
+     metric_for_best_model='f1_macro',
+
+     # Logging
+     logging_steps=100,
+     report_to='tensorboard',
+
+     # Other
+     seed=42,
+     fp16=torch.cuda.is_available(),  # Mixed precision if a GPU is available
+ )
+
+ # Set up Trainer
+ trainer = Trainer(
+     model=model,
+     args=training_args,
+     train_dataset=train_data,
+     eval_dataset=valid_data,
+     processing_class=tokenizer,
+     data_collator=data_collator,
+     compute_metrics=compute_metrics
+ )
+
+ # Train!
+ trainer.train()
+
+ # Evaluate
+ eval_results = trainer.evaluate()
+ print(eval_results)
+ ```
+
  ### Training hyperparameters

  The following hyperparameters were used during training:
  - train_batch_size: 32
  - eval_batch_size: 64
  - seed: 42
+ - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 300
  - num_epochs: 2
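The optimizer entry above corresponds to PyTorch's fused AdamW. A minimal sketch of the equivalent `torch.optim.AdamW` call (the `lr` and `weight_decay` values are taken from the training code; the tiny parameter tensor is only a placeholder for `model.parameters()`):

```python
import torch

# Placeholder parameter; in real training this would be model.parameters()
params = [torch.nn.Parameter(torch.zeros(4))]

optimizer = torch.optim.AdamW(
    params,
    lr=2e-5,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.01,
    fused=torch.cuda.is_available(),  # the fused kernel requires a CUDA device
)
```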
 
  ### Training results

+ As this is a multi-class classification problem and there is class imbalance, the main metric we evaluate this model by is `f1_macro`.
+
  | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Weighted |
  |:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------:|
  | 0.6645        | 1.0   | 1540 | 0.6703          | 0.7275   | 0.7134   | 0.7292      |
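To illustrate why macro F1 is the stricter metric under class imbalance, here is a small sketch with made-up labels (the arrays below are illustrative only, not drawn from our evaluation set):

```python
from sklearn.metrics import f1_score

# Illustrative only: a majority class (0) the model handles well,
# and a rare class (1) it misses half the time.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

macro = f1_score(y_true, y_pred, average='macro')        # treats both classes equally (~0.80)
weighted = f1_score(y_true, y_pred, average='weighted')  # dominated by the majority class (~0.89)
```

The weighted score hides the weak rare-class performance, which is why `f1_macro` is the headline metric here.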
 
  - Transformers 5.0.0
  - Pytorch 2.10.0+cu128
  - Datasets 4.0.0
+ - Tokenizers 0.22.2