margretmeng1020 committed on
Commit bc6b956 · verified · 1 Parent(s): e180c4f

Update README.md

Files changed (1)
  1. README.md +268 -3
README.md CHANGED
@@ -1,3 +1,268 @@
- ---
- license: mit
- ---
+ ---
+ language:
+ - en
+ license: mit
+ tags:
+ - text-classification
+ - multi-label-classification
+ - regulatory-capacity
+ - collaborative-learning
+ - bert
+ - education
+ - nlp
+ datasets:
+ - custom
+ metrics:
+ - f1
+ - precision
+ - recall
+ pipeline_tag: text-classification
+ model-index:
+ - name: regulatory-capacity-classifier
+   results:
+   - task:
+       type: text-classification
+       name: Multi-label Text Classification
+     metrics:
+     - name: F1-Micro
+       type: f1
+       value: 0.6554
+     - name: F1-Macro
+       type: f1
+       value: 0.4675
+     - name: Precision (Micro)
+       type: precision
+       value: 0.5600
+     - name: Recall (Micro)
+       type: recall
+       value: 0.7800
+ ---
+
+ # Regulatory Capacity Classifier
+
+ A BERT-based multi-label classifier for analyzing regulatory capacities in collaborative learning dialogues.
+
+ ## Model Description
+
+ | Attribute | Value |
+ |-----------|-------|
+ | **Base Model** | `bert-base-uncased` |
+ | **Task** | Multi-label Text Classification |
+ | **Number of Labels** | 12 |
+ | **Training Strategy** | Weighted BCEWithLogitsLoss for class imbalance |
+ | **Language** | English |
+ | **Framework** | PyTorch + HuggingFace Transformers |
+
+ ## Model Performance
+
+ ### Overall Metrics (Validation Set, Threshold=0.5)
+
+ | Metric | Score |
+ |--------|-------|
+ | **F1-Micro** | 0.6554 |
+ | **F1-Macro** | 0.4675 |
+ | **Precision (Micro)** | 0.5600 |
+ | **Recall (Micro)** | 0.7800 |
+ | **Weighted Avg F1** | 0.6800 |
+
+ ### Per-Class Performance
+
+ | Label | Precision | Recall | F1-Score | Support |
+ |-------|-----------|--------|----------|---------|
+ | Cog-Evaluate | 0.78 | 0.77 | **0.77** | 104 |
+ | Cog-Explain | 0.25 | 0.27 | 0.26 | 22 |
+ | Cog-Reason | 0.51 | 0.85 | **0.64** | 47 |
+ | Meta-Monitor | 0.64 | 0.83 | **0.72** | 127 |
+ | Meta-Orient | 0.26 | 0.83 | 0.40 | 12 |
+ | Meta-Plan | 0.39 | 0.73 | 0.51 | 15 |
+ | Meta-Reflect | 0.08 | 0.50 | 0.13 | 2 |
+ | SE-Express | 0.55 | 0.69 | **0.61** | 35 |
+ | SE-Regulate | 0.25 | 0.62 | 0.36 | 8 |
+ | TE-Act | 0.27 | 1.00 | 0.43 | 3 |
+ | TE-Report | 0.69 | 0.89 | **0.78** | 72 |
+
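The aggregate scores can be cross-checked against the per-class table. The sketch below is only an illustration: the names `rows`, `f1`, and `NUM_LABELS` are not from the model card. It recomputes each F1 from precision and recall, then averages the listed F1 scores. Dividing the sum of the 11 listed F1 scores by 12 reproduces the reported F1-Macro of 0.4675, which suggests (the card does not say so) that the twelfth label, Cog-Generate, is counted with an F1 of 0.

```python
# Per-class validation rows copied from the table: (precision, recall, f1, support)
rows = {
    "Cog-Evaluate": (0.78, 0.77, 0.77, 104),
    "Cog-Explain": (0.25, 0.27, 0.26, 22),
    "Cog-Reason": (0.51, 0.85, 0.64, 47),
    "Meta-Monitor": (0.64, 0.83, 0.72, 127),
    "Meta-Orient": (0.26, 0.83, 0.40, 12),
    "Meta-Plan": (0.39, 0.73, 0.51, 15),
    "Meta-Reflect": (0.08, 0.50, 0.13, 2),
    "SE-Express": (0.55, 0.69, 0.61, 35),
    "SE-Regulate": (0.25, 0.62, 0.36, 8),
    "TE-Act": (0.27, 1.00, 0.43, 3),
    "TE-Report": (0.69, 0.89, 0.78, 72),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


# Recompute F1 per label; the table is rounded to two decimals,
# so agreement is only expected to within about 0.01.
recomputed = {label: f1(p, r) for label, (p, r, _, _) in rows.items()}

# Assumption: the macro average runs over 12 labels, with the unlisted
# Cog-Generate contributing an F1 of 0.
NUM_LABELS = 12
macro_f1 = sum(f for _, _, f, _ in rows.values()) / NUM_LABELS

# Support-weighted average over the 11 listed labels
total_support = sum(s for _, _, _, s in rows.values())
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total_support

print(f"macro F1 over {NUM_LABELS} labels: {macro_f1:.4f}")
print(f"support-weighted F1: {weighted_f1:.4f}")
```

The support-weighted value comes out near 0.675, close to the reported 0.6800 given that the inputs are rounded to two decimals.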
+ ### Cross-Validation Results (5-Fold)
+
+ | Fold | F1-Micro | F1-Macro | Precision | Recall |
+ |------|----------|----------|-----------|--------|
+ | Fold 1 | 0.6418 | 0.4501 | 0.3781 | 0.5717 |
+ | Fold 2 | 0.6310 | 0.5032 | 0.4677 | 0.5914 |
+ | Fold 3 | 0.4904 | 0.3569 | 0.2593 | 0.7470 |
+ | Fold 4 | 0.6769 | 0.5134 | 0.4460 | 0.6331 |
+ | Fold 5 | 0.6660 | 0.5211 | 0.4420 | 0.6584 |
+ | **Mean** | **0.6212** | **0.4689** | **0.3986** | **0.6403** |
+ | Std | ±0.0754 | ±0.0685 | ±0.0848 | ±0.0687 |
+
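The Mean and Std rows can be reproduced from the per-fold values. A minimal sketch using only the standard library follows; the `folds` and `summary` names are illustrative. The reported spreads match the sample standard deviation (dividing by n - 1) to within rounding, which is an inference from the numbers rather than something the card states.

```python
from statistics import mean, stdev

# Per-fold scores copied from the cross-validation table
folds = {
    "F1-Micro": [0.6418, 0.6310, 0.4904, 0.6769, 0.6660],
    "F1-Macro": [0.4501, 0.5032, 0.3569, 0.5134, 0.5211],
    "Precision": [0.3781, 0.4677, 0.2593, 0.4460, 0.4420],
    "Recall": [0.5717, 0.5914, 0.7470, 0.6331, 0.6584],
}

# stdev() is the sample standard deviation (n - 1 in the denominator);
# pstdev() would give the smaller population value instead.
summary = {name: (mean(scores), stdev(scores)) for name, scores in folds.items()}

for name, (avg, spread) in summary.items():
    print(f"{name}: mean={avg:.4f}, std={spread:.4f}")
```

With only five folds the spread is dominated by Fold 3, so the standard deviation is a rough indication of stability rather than a tight estimate.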
+ ## Intended Use
+
+ This model is designed for analyzing collaborative learning dialogues to identify regulatory capacity categories:
+
+ ### Label Taxonomy
+
+ | Category | Labels | Description |
+ |----------|--------|-------------|
+ | **Cognitive (Cog-)** | Evaluate, Explain, Generate, Reason | Cognitive processing and reasoning |
+ | **Metacognitive (Meta-)** | Monitor, Orient, Plan, Reflect | Self-regulation and monitoring |
+ | **Socio-emotional (SE-)** | Express, Regulate | Social and emotional expressions |
+ | **Task Execution (TE-)** | Act, Report | Task-related actions and reporting |
+
+ ## Training Data
+
+ | Attribute | Session 1 | Session 2 | Total |
+ |-----------|-----------|-----------|-------|
+ | **Total Samples** | 1,620 | 1,082 | **2,702** |
+ | **AI-assisted Groups** | 865 | 564 | 1,429 |
+ | **Teams Groups** | 755 | 518 | 1,273 |
+ | **Unique Groups** | 12 | 24 | 36 |
+ | **Avg Text Length** | 78.25 chars | 65.36 chars | - |
+ | **Avg Labels/Sample** | 1.37 | 1.83 | - |
+
+ ### Label Distribution (Session 1)
+
+ | Label | Count | Percentage |
+ |-------|-------|------------|
+ | Meta-Monitor | 641 | 39.57% |
+ | Cog-Evaluate | 448 | 27.65% |
+ | TE-Report | 345 | 21.30% |
+ | Cog-Reason | 249 | 15.37% |
+ | SE-Express | 148 | 9.14% |
+ | Meta-Orient | 108 | 6.67% |
+ | Meta-Plan | 108 | 6.67% |
+ | Cog-Explain | 93 | 5.74% |
+ | SE-Regulate | 33 | 2.04% |
+ | TE-Act | 26 | 1.60% |
+ | Meta-Reflect | 22 | 1.36% |
+
+ ## Usage
+
+ ```python
+ from transformers import BertTokenizer, BertForSequenceClassification
+ import torch
+
+ # Load model and tokenizer
+ model_name = "your-username/regulatory-capacity-classifier"
+ tokenizer = BertTokenizer.from_pretrained(model_name)
+ model = BertForSequenceClassification.from_pretrained(model_name)
+
+ # Define labels; the order must match the model's output indices.
+ # NOTE: the card reports 12 labels (the taxonomy includes Cog-Generate) but only
+ # 11 are listed here, so verify this list against model.config.id2label.
+ labels = [
+     'Cog-Evaluate', 'Cog-Explain', 'Cog-Reason',
+     'Meta-Monitor', 'Meta-Orient', 'Meta-Plan', 'Meta-Reflect',
+     'SE-Express', 'SE-Regulate',
+     'TE-Act', 'TE-Report'
+ ]
+ if model.config.num_labels != len(labels):
+     raise ValueError(
+         f"Model has {model.config.num_labels} outputs but {len(labels)} label names are defined"
+     )
+
+ # Inference function
+ def predict(text, threshold=0.5):
+     # Tokenize one utterance, truncating to the 128-token training length
+     inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
+
+     with torch.no_grad():
+         outputs = model(**inputs)
+         # Sigmoid rather than softmax: each label is an independent binary decision
+         probs = torch.sigmoid(outputs.logits)
+         predictions = (probs > threshold).int()
+
+     predicted_labels = [labels[i] for i in range(len(labels)) if predictions[0][i] == 1]
+     confidence = {labels[i]: float(probs[0][i]) for i in range(len(labels))}
+
+     return predicted_labels, confidence
+
+ # Example usage
+ text = "I think we should evaluate our approach before moving forward."
+ predicted, confidence = predict(text)
+ print(f"Predicted labels: {predicted}")
+ print(f"Confidence scores: {confidence}")
+ ```
+
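The decision rule inside `predict` can be tried without downloading the model. The sketch below is a plain-Python stand-in: `sigmoid`, `decode`, and the example logits are hypothetical and are not real model output. It shows that each label is kept independently when its sigmoid probability strictly exceeds the threshold.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits, labels, threshold=0.5):
    """Return the labels whose sigmoid probability strictly exceeds the threshold."""
    probs = [sigmoid(z) for z in logits]
    return [label for label, p in zip(labels, probs) if p > threshold]


# Hypothetical logits for three labels (made up for illustration)
example_labels = ["Cog-Evaluate", "Meta-Plan", "TE-Report"]
example_logits = [2.0, 0.3, -1.5]

default_result = decode(example_logits, example_labels)
strict_result = decode(example_logits, example_labels, threshold=0.6)

print(default_result)  # labels kept at the default 0.5 threshold
print(strict_result)   # labels kept at a stricter 0.6 threshold
```

Raising the threshold generally trades recall for precision. Because the card reports higher recall than precision at 0.5, a higher threshold may be worth evaluating on held-out data.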
+ ## Training Procedure
+
+ ### Hyperparameters
+
+ | Parameter | Value |
+ |-----------|-------|
+ | **Epochs** | 8 |
+ | **Batch Size** | 16 |
+ | **Learning Rate** | 3e-5 |
+ | **Warmup Steps** | 100 |
+ | **Max Sequence Length** | 128 |
+ | **Loss Function** | Weighted BCEWithLogitsLoss |
+ | **Optimizer** | AdamW |
+ | **Train/Val Split** | 80/20 |
+ | **Random Seed** | 42 |
+
+ ### Class Weights (Computed)
+
+ Weights are computed as `pos_weight = (total_samples - positive_samples) / positive_samples`.
+
+ | Label | Weight |
+ |-------|--------|
+ | Meta-Reflect | 72.64 |
+ | TE-Act | 61.31 |
+ | SE-Regulate | 48.09 |
+ | Cog-Explain | 16.42 |
+ | Meta-Plan | 14.00 |
+ | Meta-Orient | 14.00 |
+ | SE-Express | 9.95 |
+ | Cog-Reason | 5.51 |
+ | TE-Report | 3.70 |
+ | Cog-Evaluate | 2.62 |
+ | Meta-Monitor | 1.53 |
+
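The weight table can be reproduced from the Session 1 label counts. The sketch below applies the `pos_weight` formula with `total_samples = 1620`. That total is an assumption based on the Session 1 sample count; the card does not state which subset the weights were computed on, but the results match the table to two decimals. The names `SESSION1_SAMPLES`, `positive_counts`, and `pos_weight` are illustrative.

```python
# Assumption: weights were computed over the 1,620 Session 1 samples
SESSION1_SAMPLES = 1620

# Positive counts per label, copied from the Session 1 label distribution table
positive_counts = {
    "Meta-Monitor": 641,
    "Cog-Evaluate": 448,
    "TE-Report": 345,
    "Cog-Reason": 249,
    "SE-Express": 148,
    "Meta-Orient": 108,
    "Meta-Plan": 108,
    "Cog-Explain": 93,
    "SE-Regulate": 33,
    "TE-Act": 26,
    "Meta-Reflect": 22,
}


def pos_weight(total_samples, positive_samples):
    """Ratio of negative to positive samples for one label."""
    return (total_samples - positive_samples) / positive_samples


weights = {
    label: pos_weight(SESSION1_SAMPLES, count)
    for label, count in positive_counts.items()
}

# Print from rarest to most frequent label, as in the table
for label, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{label}: {weight:.2f}")
```

In PyTorch, such values would typically be arranged in the model's label order, wrapped in a tensor, and passed as the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`, so that positive examples of rare labels contribute more to the loss.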
+ ### Hardware
+
+ - **Device**: Apple Silicon (MPS) / CUDA GPU
+ - **Training Time**: ~10-15 minutes per epoch
+ - **Total FLOPs**: 6.82 × 10^14
+
+ ## Limitations
+
+ 1. **Domain Specificity**: Trained on collaborative learning dialogues; may not generalize to other dialogue types
+ 2. **Class Imbalance**: Rare labels (Meta-Reflect, TE-Act) have low precision (0.08 and 0.27) and only 2 and 3 validation samples, so their scores are unreliable
+ 3. **Language**: English only
+ 4. **Context Length**: Maximum 128 tokens; longer texts are truncated
+ 5. **Session Shift**: Label distributions differ between sessions (1.37 vs. 1.83 average labels per sample), so performance may vary across learning sessions
+
+ ### Known Confusion Patterns
+
+ | True Label | Often Confused With | Count |
+ |------------|---------------------|-------|
+ | Cog-Evaluate | Cog-Reason | 33 |
+ | Cog-Evaluate | Meta-Monitor | 18 |
+ | Meta-Monitor | TE-Report | 17 |
+ | Meta-Monitor | Meta-Orient | 16 |
+ | Cog-Reason | Cog-Evaluate | 14 |
+
+ ## Ethical Considerations
+
+ - This model is intended for educational research purposes
+ - It should not be used as the sole basis for evaluating student performance
+ - Human review is recommended for high-stakes applications
+
+ ## Citation
+
+ ```bibtex
+ @misc{regulatory-classifier-2026,
+   title={Regulatory Capacity Classifier for Collaborative Learning Dialogues},
+   author={Anonymous},
+   year={2026},
+   publisher={Hugging Face},
+   url={https://huggingface.co/your-username/regulatory-capacity-classifier},
+   note={Multi-label BERT classifier trained on 2,702 annotated utterances}
+ }
+ ```
+
+ ## Model Card Authors
+
+ Generated on 2026-01-24
+
+ ---
+
+ ## Files Included
+
+ | File | Description |
+ |------|-------------|
+ | `model.safetensors` | Model weights (SafeTensors format) |
+ | `config.json` | Model configuration |
+ | `vocab.txt` | BERT vocabulary |
+ | `tokenizer_config.json` | Tokenizer configuration |
+ | `special_tokens_map.json` | Special tokens mapping |
+ | `example_usage.py` | Usage example script |