# Regulatory Capacity Classifier

A BERT-based multi-label classifier for analyzing regulatory capacities in collaborative learning dialogues.
## Model Description

| Attribute | Value |
|---|---|
| Base Model | `bert-base-uncased` |
| Task | Multi-label Text Classification |
| Number of Labels | 12 |
| Training Strategy | Weighted `BCEWithLogitsLoss` for class imbalance |
| Language | English |
| Framework | PyTorch + HuggingFace Transformers |
## Model Performance

### Overall Metrics (Validation Set, Threshold = 0.5)

| Metric | Score |
|---|---|
| F1-Micro | 0.6554 |
| F1-Macro | 0.4675 |
| Precision (Micro) | 0.5600 |
| Recall (Micro) | 0.7800 |
| Weighted Avg F1 | 0.6800 |
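Micro-averaged scores pool true/false positives and negatives across all labels before computing precision and recall, which is why they can differ sharply from the macro average under class imbalance. A minimal sketch of the computation on hypothetical toy data (not the actual validation predictions):

```python
# Sketch: micro-averaged precision/recall/F1 pooled over all labels,
# as reported in the table above (toy data, not the real validation set).

def micro_metrics(y_true, y_pred):
    """y_true, y_pred: equal-shaped lists of binary rows (samples x labels)."""
    tp = fp = fn = 0
    for row_t, row_p in zip(y_true, y_pred):
        for t, p in zip(row_t, row_p):
            if t == 1 and p == 1:
                tp += 1          # correct positive prediction
            elif t == 0 and p == 1:
                fp += 1          # spurious label
            elif t == 1 and p == 0:
                fn += 1          # missed label
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Two samples, three labels: one false positive and one false negative.
p, r, f1 = micro_metrics([[1, 0, 1], [0, 1, 0]], [[1, 1, 0], [0, 1, 0]])
print(p, r, f1)
```

Because recall (0.78) exceeds precision (0.56) in the table, the model errs toward over-predicting labels at the 0.5 threshold.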
### Per-Class Performance

| Label | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Cog-Evaluate | 0.78 | 0.77 | 0.77 | 104 |
| Cog-Explain | 0.25 | 0.27 | 0.26 | 22 |
| Cog-Reason | 0.51 | 0.85 | 0.64 | 47 |
| Meta-Monitor | 0.64 | 0.83 | 0.72 | 127 |
| Meta-Orient | 0.26 | 0.83 | 0.40 | 12 |
| Meta-Plan | 0.39 | 0.73 | 0.51 | 15 |
| Meta-Reflect | 0.08 | 0.50 | 0.13 | 2 |
| SE-Express | 0.55 | 0.69 | 0.61 | 35 |
| SE-Regulate | 0.25 | 0.62 | 0.36 | 8 |
| TE-Act | 0.27 | 1.00 | 0.43 | 3 |
| TE-Report | 0.69 | 0.89 | 0.78 | 72 |
### Cross-Validation Results (5-Fold)

| Fold | F1-Micro | F1-Macro | Precision | Recall |
|---|---|---|---|---|
| Fold 1 | 0.6418 | 0.4501 | 0.3781 | 0.5717 |
| Fold 2 | 0.6310 | 0.5032 | 0.4677 | 0.5914 |
| Fold 3 | 0.4904 | 0.3569 | 0.2593 | 0.7470 |
| Fold 4 | 0.6769 | 0.5134 | 0.4460 | 0.6331 |
| Fold 5 | 0.6660 | 0.5211 | 0.4420 | 0.6584 |
| Mean | 0.6212 | 0.4689 | 0.3986 | 0.6403 |
| Std | ±0.0754 | ±0.0685 | ±0.0848 | ±0.0687 |
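The Mean and Std rows follow directly from the five fold scores; the ± values correspond to the sample standard deviation (ddof = 1), as this small check reproduces for the F1-Micro column:

```python
# Sketch: aggregating the per-fold F1-Micro scores from the table above
# into the reported mean and (sample) standard deviation.
import statistics

f1_micro = [0.6418, 0.6310, 0.4904, 0.6769, 0.6660]

mean = statistics.mean(f1_micro)
std = statistics.stdev(f1_micro)  # sample std (ddof=1), matching the ± values

print(round(mean, 4), round(std, 4))  # 0.6212 0.0754
```

Fold 3 is the clear outlier (F1-Micro 0.4904, recall 0.7470), which accounts for most of the spread.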
## Intended Use

This model is designed for analyzing collaborative learning dialogues to identify regulatory capacity categories:
### Label Taxonomy

| Category | Labels | Description |
|---|---|---|
| Cognitive (Cog-) | Evaluate, Explain, Generate, Reason | Cognitive processing and reasoning |
| Metacognitive (Meta-) | Monitor, Orient, Plan, Reflect | Self-regulation and monitoring |
| Socio-emotional (SE-) | Express, Regulate | Social and emotional expressions |
| Task Execution (TE-) | Act, Report | Task-related actions and reporting |
## Training Data

| Attribute | Session 1 | Session 2 | Total |
|---|---|---|---|
| Total Samples | 1,620 | 1,082 | 2,702 |
| AI-assisted Groups | 865 | 564 | 1,429 |
| Teams Groups | 755 | 518 | 1,273 |
| Unique Groups | 12 | 24 | 36 |
| Avg Text Length | 78.25 chars | 65.36 chars | - |
| Avg Labels/Sample | 1.37 | 1.83 | - |
### Label Distribution (Session 1)

| Label | Count | Percentage |
|---|---|---|
| Meta-Monitor | 641 | 39.57% |
| Cog-Evaluate | 448 | 27.65% |
| TE-Report | 345 | 21.30% |
| Cog-Reason | 249 | 15.37% |
| SE-Express | 148 | 9.14% |
| Meta-Orient | 108 | 6.67% |
| Meta-Plan | 108 | 6.67% |
| Cog-Explain | 93 | 5.74% |
| SE-Regulate | 33 | 2.04% |
| TE-Act | 26 | 1.60% |
| Meta-Reflect | 22 | 1.36% |
## Usage

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

model_name = "your-username/regulatory-capacity-classifier"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

# 12 labels, matching the model's output dimension; the order must match
# the order used during training.
labels = [
    'Cog-Evaluate', 'Cog-Explain', 'Cog-Generate', 'Cog-Reason',
    'Meta-Monitor', 'Meta-Orient', 'Meta-Plan', 'Meta-Reflect',
    'SE-Express', 'SE-Regulate',
    'TE-Act', 'TE-Report'
]

def predict(text, threshold=0.5):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
    # Multi-label setup: sigmoid per label, then threshold each independently.
    probs = torch.sigmoid(outputs.logits)
    predictions = (probs > threshold).int()
    predicted_labels = [labels[i] for i in range(len(labels)) if predictions[0][i] == 1]
    confidence = {labels[i]: float(probs[0][i]) for i in range(len(labels))}
    return predicted_labels, confidence

text = "I think we should evaluate our approach before moving forward."
predicted, confidence = predict(text)
print(f"Predicted labels: {predicted}")
print(f"Confidence scores: {confidence}")
```
## Training Procedure

### Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 8 |
| Batch Size | 16 |
| Learning Rate | 3e-5 |
| Warmup Steps | 100 |
| Max Sequence Length | 128 |
| Loss Function | Weighted `BCEWithLogitsLoss` |
| Optimizer | AdamW |
| Train/Val Split | 80/20 |
| Random Seed | 42 |
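The warmup setting corresponds to the linear warmup-then-decay schedule commonly paired with AdamW when fine-tuning BERT. A standalone sketch of that schedule; the total step count here is a hypothetical placeholder, not a value taken from the training run:

```python
# Sketch: linear warmup + linear decay learning-rate schedule, as commonly
# used for BERT fine-tuning with AdamW.
# NOTE: total_steps is a hypothetical placeholder, not from the actual run.

BASE_LR = 3e-5
WARMUP_STEPS = 100

def lr_at(step, total_steps=1000):
    """Learning rate after `step` optimizer updates."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to the base learning rate.
        return BASE_LR * step / WARMUP_STEPS
    # Then decay linearly back to 0 by the end of training.
    return BASE_LR * (total_steps - step) / (total_steps - WARMUP_STEPS)

print(lr_at(50), lr_at(100), lr_at(1000))
```

With batch size 16 and 2,702 samples, warmup covers well under one epoch of the 8-epoch run.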
### Class Weights (Computed)

Weights are computed as `pos_weight = (total_samples - positive_samples) / positive_samples`:

| Label | Weight |
|---|---|
| Meta-Reflect | 72.64 |
| TE-Act | 61.31 |
| SE-Regulate | 48.09 |
| Cog-Explain | 16.42 |
| Meta-Plan | 14.00 |
| Meta-Orient | 14.00 |
| SE-Express | 9.95 |
| Cog-Reason | 5.51 |
| TE-Report | 3.70 |
| Cog-Evaluate | 2.62 |
| Meta-Monitor | 1.53 |
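The tabulated weights are consistent with applying this formula to the Session 1 label counts (1,620 samples) listed under Label Distribution; a sketch reproducing a few of them:

```python
# Sketch: reproducing tabulated pos_weight values from the formula
# pos_weight = (total_samples - positive_samples) / positive_samples,
# using the Session 1 label counts from the Label Distribution table.

TOTAL = 1620  # Session 1 samples

counts = {
    'Meta-Reflect': 22,
    'TE-Act': 26,
    'SE-Regulate': 33,
    'Meta-Monitor': 641,
}

weights = {label: (TOTAL - n) / n for label, n in counts.items()}

for label, w in weights.items():
    print(f"{label}: {w:.2f}")
# Meta-Reflect: 72.64, TE-Act: 61.31, SE-Regulate: 48.09, Meta-Monitor: 1.53
```

Passed as the `pos_weight` tensor to `BCEWithLogitsLoss`, these ratios scale up the loss on positives of rare labels so the classifier does not learn to simply ignore them.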
### Hardware

- Device: Apple Silicon (MPS) / CUDA GPU
- Training Time: ~10-15 minutes per epoch
- Total FLOPs: 6.82 × 10^14
## Limitations

- Domain Specificity: Trained on collaborative learning dialogues; may not generalize to other dialogue types
- Class Imbalance: Rare labels (Meta-Reflect, TE-Act) have lower prediction accuracy
- Language: English only
- Context Length: Maximum 128 tokens; longer texts are truncated
- Session Shift: Performance may vary across different learning sessions due to distribution shift
### Known Confusion Patterns

| True Label | Often Confused With | Count |
|---|---|---|
| Cog-Evaluate | Cog-Reason | 33 |
| Cog-Evaluate | Meta-Monitor | 18 |
| Meta-Monitor | TE-Report | 17 |
| Meta-Monitor | Meta-Orient | 16 |
| Cog-Reason | Cog-Evaluate | 14 |
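The model card does not specify how these multi-label confusions were counted. One plausible definition, sketched here on toy data, pairs each gold label the model missed with each label it predicted spuriously on the same sample:

```python
# Sketch: one plausible way to count multi-label confusions. For each
# sample, every missed gold label is paired with every spuriously
# predicted label. (Hypothetical definition; toy data, not the real eval.)
from collections import Counter

def confusion_pairs(true_sets, pred_sets):
    pairs = Counter()
    for true, pred in zip(true_sets, pred_sets):
        for missed in true - pred:        # gold labels the model failed to predict
            for spurious in pred - true:  # predicted labels not in the gold set
                pairs[(missed, spurious)] += 1
    return pairs

true_sets = [{'Cog-Evaluate'}, {'Cog-Evaluate'}, {'Meta-Monitor'}]
pred_sets = [{'Cog-Reason'}, {'Cog-Reason'}, {'Meta-Monitor'}]
print(confusion_pairs(true_sets, pred_sets))
# Counter({('Cog-Evaluate', 'Cog-Reason'): 2})
```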
## Ethical Considerations

- This model is intended for educational research purposes.
- It should not be used as the sole basis for evaluating student performance.
- Human review is recommended for high-stakes applications.
## Citation

```bibtex
@misc{regulatory-classifier-2026,
  title={Regulatory Capacity Classifier for Collaborative Learning Dialogues},
  author={Anonymous},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/your-username/regulatory-capacity-classifier},
  note={Multi-label BERT classifier trained on 2,702 annotated utterances}
}
```
## Model Card Authors

Generated on 2026-01-24.
## Files Included

| File | Description |
|---|---|
| `model.safetensors` | Model weights (SafeTensors format) |
| `config.json` | Model configuration |
| `vocab.txt` | BERT vocabulary |
| `tokenizer_config.json` | Tokenizer configuration |
| `special_tokens_map.json` | Special tokens mapping |
| `example_usage.py` | Usage example script |