# Regulatory Capacity Classifier

A BERT-based multi-label classifier for analyzing regulatory capacities in collaborative learning dialogues.

## Model Description

| Attribute | Value |
|---|---|
| Base Model | bert-base-uncased |
| Task | Multi-label Text Classification |
| Number of Labels | 12 |
| Training Strategy | Weighted BCEWithLogitsLoss for class imbalance |
| Language | English |
| Framework | PyTorch + HuggingFace Transformers |

## Model Performance

### Overall Metrics (Validation Set, Threshold=0.5)

| Metric | Score |
|---|---|
| F1-Micro | 0.6554 |
| F1-Macro | 0.4675 |
| Precision (Micro) | 0.5600 |
| Recall (Micro) | 0.7800 |
| Weighted Avg F1 | 0.6800 |
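For reference, the micro- and macro-averaged F1 scores above follow the standard multi-label definitions over 0/1 prediction matrices. A minimal NumPy sketch (the variable names and toy data are illustrative, not from the training code):

```python
import numpy as np

def multilabel_f1(y_true, y_pred):
    """Return (f1_micro, f1_macro) for 0/1 matrices of shape (n_samples, n_labels)."""
    tp = (y_true & y_pred).sum(axis=0).astype(float)        # per-label true positives
    fp = ((1 - y_true) & y_pred).sum(axis=0).astype(float)  # per-label false positives
    fn = (y_true & (1 - y_pred)).sum(axis=0).astype(float)  # per-label false negatives

    # Micro-averaging pools counts across all labels before computing F1
    f1_micro = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())

    # Macro-averaging computes F1 per label, then takes the unweighted mean
    denom = 2 * tp + fp + fn
    per_label = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return f1_micro, per_label.mean()

# Toy stand-in for real prediction matrices (rows = utterances, cols = labels)
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1], [1, 0, 0]])
f1_micro, f1_macro = multilabel_f1(y_true, y_pred)
```

Micro-averaging favors frequent labels, while macro-averaging treats every label equally, which is why the macro score is noticeably lower under the class imbalance described below.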

### Per-Class Performance

| Label | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Cog-Evaluate | 0.78 | 0.77 | 0.77 | 104 |
| Cog-Explain | 0.25 | 0.27 | 0.26 | 22 |
| Cog-Reason | 0.51 | 0.85 | 0.64 | 47 |
| Meta-Monitor | 0.64 | 0.83 | 0.72 | 127 |
| Meta-Orient | 0.26 | 0.83 | 0.40 | 12 |
| Meta-Plan | 0.39 | 0.73 | 0.51 | 15 |
| Meta-Reflect | 0.08 | 0.50 | 0.13 | 2 |
| SE-Express | 0.55 | 0.69 | 0.61 | 35 |
| SE-Regulate | 0.25 | 0.62 | 0.36 | 8 |
| TE-Act | 0.27 | 1.00 | 0.43 | 3 |
| TE-Report | 0.69 | 0.89 | 0.78 | 72 |
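As a sanity check, the support-weighted mean of the (rounded) per-class F1 scores in the table above lands close to the reported Weighted Avg F1 of 0.68:

```python
# Per-class F1 and support, copied from the table above (rounded values)
f1 =      [0.77, 0.26, 0.64, 0.72, 0.40, 0.51, 0.13, 0.61, 0.36, 0.43, 0.78]
support = [104,  22,   47,   127,  12,   15,   2,    35,   8,    3,    72]

# Support-weighted average: each label's F1 counts in proportion to its frequency
weighted_f1 = sum(f * s for f, s in zip(f1, support)) / sum(support)
print(round(weighted_f1, 3))  # 0.675
```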

### Cross-Validation Results (5-Fold)

| Fold | F1-Micro | F1-Macro | Precision | Recall |
|---|---|---|---|---|
| Fold 1 | 0.6418 | 0.4501 | 0.3781 | 0.5717 |
| Fold 2 | 0.6310 | 0.5032 | 0.4677 | 0.5914 |
| Fold 3 | 0.4904 | 0.3569 | 0.2593 | 0.7470 |
| Fold 4 | 0.6769 | 0.5134 | 0.4460 | 0.6331 |
| Fold 5 | 0.6660 | 0.5211 | 0.4420 | 0.6584 |
| **Mean** | 0.6212 | 0.4689 | 0.3986 | 0.6403 |
| **Std** | ±0.0754 | ±0.0685 | ±0.0848 | ±0.0687 |
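The Mean/Std row can be reproduced from the per-fold scores; the reported Std matches the sample standard deviation (ddof=1), shown here for the F1-Micro column:

```python
import numpy as np

# F1-Micro per fold, copied from the cross-validation table above
f1_micro = np.array([0.6418, 0.6310, 0.4904, 0.6769, 0.6660])

# ddof=1 gives the sample (Bessel-corrected) standard deviation
print(f"{f1_micro.mean():.4f} ± {f1_micro.std(ddof=1):.4f}")  # 0.6212 ± 0.0754
```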

## Intended Use

This model is designed for analyzing collaborative learning dialogues and identifying the regulatory capacity categories defined by the taxonomy below.

### Label Taxonomy

| Category | Labels | Description |
|---|---|---|
| Cognitive (Cog-) | Evaluate, Explain, Generate, Reason | Cognitive processing and reasoning |
| Metacognitive (Meta-) | Monitor, Orient, Plan, Reflect | Self-regulation and monitoring |
| Socio-emotional (SE-) | Express, Regulate | Social and emotional expressions |
| Task Execution (TE-) | Act, Report | Task-related actions and reporting |

## Training Data

| Attribute | Session 1 | Session 2 | Total |
|---|---|---|---|
| Total Samples | 1,620 | 1,082 | 2,702 |
| AI-assisted Groups | 865 | 564 | 1,429 |
| Teams Groups | 755 | 518 | 1,273 |
| Unique Groups | 12 | 24 | 36 |
| Avg Text Length | 78.25 chars | 65.36 chars | – |
| Avg Labels/Sample | 1.37 | 1.83 | – |

### Label Distribution (Session 1)

| Label | Count | Percentage |
|---|---|---|
| Meta-Monitor | 641 | 39.57% |
| Cog-Evaluate | 448 | 27.65% |
| TE-Report | 345 | 21.30% |
| Cog-Reason | 249 | 15.37% |
| SE-Express | 148 | 9.14% |
| Meta-Orient | 108 | 6.67% |
| Meta-Plan | 108 | 6.67% |
| Cog-Explain | 93 | 5.74% |
| SE-Regulate | 33 | 2.04% |
| TE-Act | 26 | 1.60% |
| Meta-Reflect | 22 | 1.36% |

## Usage

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load model and tokenizer
model_name = "your-username/regulatory-capacity-classifier"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for deterministic inference

# Labels, in the order used by the classification head
labels = [
    'Cog-Evaluate', 'Cog-Explain', 'Cog-Reason',
    'Meta-Monitor', 'Meta-Orient', 'Meta-Plan', 'Meta-Reflect',
    'SE-Express', 'SE-Regulate',
    'TE-Act', 'TE-Report'
]

# Inference function
def predict(text, threshold=0.5):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

    with torch.no_grad():
        outputs = model(**inputs)
        # Sigmoid per label (multi-label), not softmax across labels
        probs = torch.sigmoid(outputs.logits)
        predictions = (probs > threshold).int()

    predicted_labels = [label for i, label in enumerate(labels) if predictions[0][i] == 1]
    confidence = {label: float(probs[0][i]) for i, label in enumerate(labels)}

    return predicted_labels, confidence

# Example usage
text = "I think we should evaluate our approach before moving forward."
predicted, confidence = predict(text)
print(f"Predicted labels: {predicted}")
print(f"Confidence scores: {confidence}")
```

## Training Procedure

### Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 8 |
| Batch Size | 16 |
| Learning Rate | 3e-5 |
| Warmup Steps | 100 |
| Max Sequence Length | 128 |
| Loss Function | Weighted BCEWithLogitsLoss |
| Optimizer | AdamW |
| Train/Val Split | 80/20 |
| Random Seed | 42 |
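The loss setup implied by the table can be sketched as follows. This is an illustrative 3-label toy, not the training script; the real model uses one `pos_weight` entry per regulatory-capacity label (see the computed weights below):

```python
import torch
import torch.nn as nn

# Toy per-label positive-class weights (3 labels for illustration)
pos_weight = torch.tensor([72.64, 61.31, 3.70])
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.tensor([[1.2, -0.8, 0.3]])   # raw classifier outputs
targets = torch.tensor([[1.0, 0.0, 1.0]])   # multi-hot gold labels
loss = loss_fn(logits, targets)             # scalar; missed rare positives cost more
```

With `pos_weight`, a false negative on a rare label (e.g. weight 72.64) contributes far more to the loss than one on a frequent label, counteracting the class imbalance.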

### Class Weights (Computed)

Weights were computed using the formula `pos_weight = (total_samples - positive_samples) / positive_samples`:

| Label | Weight |
|---|---|
| Meta-Reflect | 72.64 |
| TE-Act | 61.31 |
| SE-Regulate | 48.09 |
| Cog-Explain | 16.42 |
| Meta-Plan | 14.00 |
| Meta-Orient | 14.00 |
| SE-Express | 9.95 |
| Cog-Reason | 5.51 |
| TE-Report | 3.70 |
| Cog-Evaluate | 2.62 |
| Meta-Monitor | 1.53 |
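Plugging the Session 1 label counts (from the Label Distribution table) into this formula reproduces the weights above:

```python
# pos_weight = (total_samples - positive_samples) / positive_samples,
# applied to the Session 1 label counts (1,620 samples total)
counts = {
    "Meta-Monitor": 641, "Cog-Evaluate": 448, "TE-Report": 345,
    "Cog-Reason": 249, "SE-Express": 148, "Meta-Orient": 108,
    "Meta-Plan": 108, "Cog-Explain": 93, "SE-Regulate": 33,
    "TE-Act": 26, "Meta-Reflect": 22,
}
total = 1620
pos_weight = {label: round((total - n) / n, 2) for label, n in counts.items()}
print(pos_weight["Meta-Reflect"], pos_weight["Meta-Monitor"])  # 72.64 1.53
```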

### Hardware

- Device: Apple Silicon (MPS) / CUDA GPU
- Training Time: ~10-15 minutes per epoch
- Total FLOPs: 6.82 × 10^14

## Limitations

1. **Domain Specificity**: Trained on collaborative learning dialogues; may not generalize to other dialogue types.
2. **Class Imbalance**: Rare labels (Meta-Reflect, TE-Act) have lower prediction accuracy.
3. **Language**: English only.
4. **Context Length**: Maximum of 128 tokens; longer texts are truncated.
5. **Session Shift**: Performance may vary across different learning sessions due to distribution shift.

## Known Confusion Patterns

| True Label | Often Confused With | Count |
|---|---|---|
| Cog-Evaluate | Cog-Reason | 33 |
| Cog-Evaluate | Meta-Monitor | 18 |
| Meta-Monitor | TE-Report | 17 |
| Meta-Monitor | Meta-Orient | 16 |
| Cog-Reason | Cog-Evaluate | 14 |

## Ethical Considerations

- This model is intended for educational research purposes.
- It should not be used as the sole basis for evaluating student performance.
- Human review is recommended for high-stakes applications.

## Citation

```bibtex
@misc{regulatory-classifier-2026,
  title={Regulatory Capacity Classifier for Collaborative Learning Dialogues},
  author={Anonymous},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/your-username/regulatory-capacity-classifier},
  note={Multi-label BERT classifier trained on 2,702 annotated utterances}
}
```

## Model Card Authors

Generated on 2026-01-24.


## Files Included

| File | Description |
|---|---|
| model.safetensors | Model weights (SafeTensors format) |
| config.json | Model configuration |
| vocab.txt | BERT vocabulary |
| tokenizer_config.json | Tokenizer configuration |
| special_tokens_map.json | Special tokens mapping |
| example_usage.py | Usage example script |