KingTechnician
/

bert-osmosis-coverage

+---
+language: en
+license: apache-2.0
+tags:
+- education
+- coverage-assessment
+- bert
+- regression
+- domain-agnostic
+- educational-ai
+datasets:
+- synthetic-educational-conversations
+metrics:
+- pearson_correlation
+- mae
+- r_squared
+model-index:
+- name: BERT Coverage Assessment
+  results:
+  - task:
+      type: regression
+      name: Educational Coverage Assessment
+    metrics:
+    - type: pearson_correlation
+      value: 0.865
+      name: Pearson Correlation
+    - type: r_squared
+      value: 0.749
+      name: R-squared
+    - type: mae
+      value: 0.133
+      name: Mean Absolute Error
+---
+# BERT Coverage Assessment Model
+🎯 **A domain-agnostic BERT model for assessing educational conversation coverage**
+## Model Description
+This model fine-tunes BERT to predict how well a student conversation addresses a given learning objective. It reaches a **0.865 Pearson correlation** with coverage assessments and is intended for real-time educational applications.
+## Key Features
+- 🌍 **Domain-agnostic**: Works across subjects without retraining
+- 📊 **Continuous scoring**: Outputs 0.0-1.0 coverage scores
+- ⚡ **Real-time capable**: Fast inference for live systems
+- 🎓 **Research-validated**: Exceeds academic benchmarks
+## Performance
+| Metric | Value |
+|--------|-------|
+| Pearson Correlation | 0.865 |
+| R-squared | 0.749 |
+| Mean Absolute Error | 0.133 |
+| RMSE | 0.165 |
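For readers who want to compute the same four metrics on their own held-out data, the sketch below uses the standard textbook definitions in plain Python. The function name `regression_metrics` and the sample scores are hypothetical. The card does not say how the reported values were computed, so this illustrates the definitions rather than the author's evaluation script.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return Pearson r, R-squared, MAE and RMSE for two equal-length sequences.

    Assumes neither sequence is constant (otherwise a variance is zero).
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)  # total sum of squares
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return {
        "pearson": cov / math.sqrt(var_t * var_p),
        "r_squared": 1.0 - ss_res / var_t,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": math.sqrt(ss_res / n),
    }

# Hypothetical reference scores and predictions, for illustration only
reference = [0.0, 0.5, 1.0]
predicted = [0.1, 0.5, 0.9]
metrics = regression_metrics(reference, predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```

R-squared here is the coefficient of determination (1 - SS_res / SS_tot), which can be lower than the squared Pearson correlation when predictions are biased or mis-scaled.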
+## Usage
+```python
+import torch
+import torch.nn as nn
+from transformers import AutoModel, AutoTokenizer
+class BERTCoverageRegressor(nn.Module):
+    """BERT encoder with a single-output linear regression head."""
+    def __init__(self, model_name='bert-base-uncased', dropout_rate=0.3):
+        super().__init__()
+        self.bert = AutoModel.from_pretrained(model_name)
+        self.dropout = nn.Dropout(dropout_rate)
+        self.regressor = nn.Linear(self.bert.config.hidden_size, 1)
+    def forward(self, input_ids, attention_mask):
+        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
+        # Regress from the pooled [CLS] representation
+        pooled_output = self.dropout(outputs.pooler_output)
+        return self.regressor(pooled_output)
+# Load the tokenizer from the Hub and build the model architecture
+tokenizer = AutoTokenizer.from_pretrained('KingTechnician/bert-osmosis-coverage')
+model = BERTCoverageRegressor()
+# Load the fine-tuned weights (download pytorch_model.bin from the repo first)
+model_path = "pytorch_model.bin"
+model.load_state_dict(torch.load(model_path, map_location='cpu'))
+model.eval()  # disable dropout for inference
+def predict_coverage(objective, conversation, max_length=512):
+    """Return a coverage score in [0.0, 1.0] for one objective/conversation pair."""
+    # Encode as a sentence pair: [CLS] objective [SEP] conversation [SEP]
+    encoding = tokenizer(
+        objective,
+        conversation,
+        truncation=True,
+        padding='max_length',
+        max_length=max_length,
+        return_tensors='pt'
+    )
+    with torch.no_grad():
+        output = model(encoding['input_ids'], encoding['attention_mask'])
+    # The linear head is unbounded, so clamp to the documented score range
+    return torch.clamp(output.squeeze(), 0.0, 1.0).item()
+# Example usage
+objective = "Understand the process of photosynthesis"
+conversation = "Student explains light reactions and Calvin cycle with examples..."
+coverage_score = predict_coverage(objective, conversation)
+print(f"Coverage Score: {coverage_score:.3f}")
+```
+## Input Format
+The model expects input in the format:
+```
+[CLS] learning_objective [SEP] student_conversation [SEP]
+```
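The layout above is what a BERT tokenizer produces when it is given two text segments. The sketch below mimics that layout with whitespace-split words standing in for WordPiece tokens, so it runs without downloading anything. `build_pair_input` is a hypothetical helper, not part of `transformers`, and it skips truncation and padding.

```python
def build_pair_input(objective, conversation):
    """Assemble BERT-style sentence-pair tokens and segment ids.

    Whitespace splitting is a stand-in for WordPiece tokenization.
    """
    first = objective.lower().split()
    second = conversation.lower().split()
    tokens = ["[CLS]"] + first + ["[SEP]"] + second + ["[SEP]"]
    # Segment 0 covers [CLS], the objective and the first [SEP];
    # segment 1 covers the conversation and the final [SEP].
    token_type_ids = [0] * (len(first) + 2) + [1] * (len(second) + 1)
    return tokens, token_type_ids

tokens, token_type_ids = build_pair_input(
    "Understand photosynthesis",
    "Plants convert light into sugar",
)
print(" ".join(tokens))
print(token_type_ids)
```

The `forward` method in the Usage section passes only `input_ids` and `attention_mask`, so the segment ids are left at BERT's default of all zeros and the two `[SEP]` tokens are what mark the boundary between objective and conversation.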
+## Output
+Returns a continuous score between 0.0 and 1.0 (the usage example clamps the raw regression output to this range):
+- **0.0-0.2**: Minimal coverage
+- **0.3-0.4**: Low coverage
+- **0.5-0.6**: Moderate coverage
+- **0.7-0.8**: High coverage
+- **0.9-1.0**: Complete coverage
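The bands above are listed at one-decimal resolution, so a raw score such as 0.27 does not fall inside any listed range. One way to map a continuous score onto a label is sketched below, under the assumption that scores are rounded to one decimal place before lookup. The card does not state this rule, and `coverage_band` is a hypothetical helper.

```python
def coverage_band(score):
    """Map a coverage score in [0.0, 1.0] to one of the five labels.

    Assumption: the score is rounded to one decimal place first, which
    is one reading of the bands listed in the model card.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    rounded = round(score, 1)
    if rounded <= 0.2:
        return "Minimal coverage"
    if rounded <= 0.4:
        return "Low coverage"
    if rounded <= 0.6:
        return "Moderate coverage"
    if rounded <= 0.8:
        return "High coverage"
    return "Complete coverage"

for example in (0.05, 0.27, 0.72):
    print(example, coverage_band(example))
```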
+## Training Data
+Trained on synthetic educational conversations across multiple domains:
+- Computer Science (algorithms, data structures)
+- Statistics (hypothesis testing, regression)
+- Multi-domain conversations
+## Research Background
+This model implements the methodology from research on domain-agnostic educational assessment, achieving significant improvements over traditional similarity-based approaches:
+- **269% improvement** over baseline similarity features
+- **Domain transfer capability** without retraining
+- **Real-time processing** under 100 ms per assessment
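Latency depends on hardware, sequence length and batch size, so the per-assessment figure is worth re-measuring in the target environment. A minimal timing sketch follows. `mean_latency_ms` and `dummy_scorer` are hypothetical names, and the dummy workload only stands in for the `predict_coverage` function from the Usage section.

```python
import time

def mean_latency_ms(fn, *args, repeats=20, warmup=3):
    """Average wall-clock time of fn(*args) in milliseconds."""
    for _ in range(warmup):
        fn(*args)  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) * 1000.0 / repeats

# Stand-in workload; swap in the real scoring function when measuring.
def dummy_scorer(objective, conversation):
    return min(1.0, len(conversation) / 100.0)

latency = mean_latency_ms(dummy_scorer, "objective text", "conversation text")
print(f"Mean latency: {latency:.4f} ms")
```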
+## Limitations
+- Trained primarily on synthetic data (validation on real conversations recommended)
+- Optimized for English language conversations
+- Performance may vary for highly specialized technical domains
+## Citation
+If you use this model in your research, please cite:
+```bibtex
+@misc{bert-coverage-assessment,
+  title={Domain-Agnostic Coverage Assessment Through BERT Fine-tuning},
+  author={Your Name},
+  year={2025},
+  url={https://huggingface.co/KingTechnician/bert-osmosis-coverage}
+}
+```
+## Contact
+For questions or collaborations, please open an issue in the model repository.
+---
+**Model Type**: Educational AI | **Task**: Coverage Assessment | **Performance**: r=0.865