---
base_model: answerdotai/ModernBERT-base
library_name: transformers
license: apache-2.0
datasets:
  - massaindustries/dataset-B-modernbert-train
tags:
  - text-classification
  - multi-label
  - modernbert
  - capability-classifier
  - routing
---

# ModernBERT capability classifier (6 dimensions)

A fine-tune of `answerdotai/ModernBERT-base` on [`massaindustries/dataset-B-modernbert-train`](https://huggingface.co/datasets/massaindustries/dataset-B-modernbert-train).
The model outputs one independent sigmoid score in [0, 1] for each of 6 capability dimensions:

1. `instruction_following`
2. `coding`
3. `math_reasoning`
4. `world_knowledge`
5. `planning_agentic`
6. `creative_synthesis`

Designed for downstream routing in the Brick semantic router as a drop-in replacement for the domain classifier.
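
As an illustration of how the six scores might feed a routing decision, here is a minimal sketch. It is not the Brick router's logic, which this card does not describe: the backend names, their capability profiles, and the dot-product scoring rule are all hypothetical.

```python
DIMENSIONS = [
    "instruction_following", "coding", "math_reasoning",
    "world_knowledge", "planning_agentic", "creative_synthesis",
]

# Hypothetical backends with made-up strengths per capability dimension
BACKEND_PROFILES = {
    "code-model":    [0.7, 0.9, 0.6, 0.4, 0.5, 0.3],
    "general-model": [0.8, 0.5, 0.5, 0.8, 0.6, 0.8],
}

def route(scores, profiles=BACKEND_PROFILES):
    """Return the backend whose profile has the largest dot product with the scores."""
    if len(scores) != len(DIMENSIONS):
        raise ValueError(f"expected {len(DIMENSIONS)} scores, got {len(scores)}")

    def fit(name):
        return sum(s * w for s, w in zip(scores, profiles[name]))

    return max(profiles, key=fit)

# Illustrative classifier output for a coding-heavy prompt
print(route([0.5, 0.95, 0.2, 0.1, 0.2, 0.05]))
```

A weighted match like this uses all six scores at once; a simpler alternative would be to route on the single highest-scoring dimension.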

## Training
- Architecture: ModernBERT + Linear(hidden→6) + sigmoid
- Loss: BCEWithLogitsLoss on soft float labels (judge mean)
- Precision: bf16 + FlashAttention-2
- HF problem_type: `multi_label_classification`
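
The loss listed above can be illustrated without any framework. The following is a minimal sketch, not the training code: it computes binary cross-entropy from raw logits against soft targets in pure Python, averaging over all elements as `torch.nn.BCEWithLogitsLoss` does by default. The logits and soft labels are made-up values.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over all elements, computed from raw logits.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)),
    which equals -[y*log(sigmoid(x)) + (1 - y)*log(1 - sigmoid(x))].
    """
    total = 0.0
    for x, y in zip(logits, targets):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

# Hypothetical logits for one prompt over the 6 dimensions
logits = [0.4, 2.1, -1.3, -0.8, -0.2, -2.0]
# Hypothetical soft labels in [0, 1], e.g. the mean of several judge ratings
soft_labels = [0.6, 0.9, 0.2, 0.3, 0.4, 0.1]

loss = bce_with_logits(logits, soft_labels)
print(f"loss: {loss:.4f}")
```

Because the targets are soft, the lowest achievable loss is the entropy of the labels rather than zero, so a non-zero loss does not by itself indicate a poor fit.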

## Inference example
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = 'massaindustries/modernbert-capability-classifier'
m = AutoModelForSequenceClassification.from_pretrained(model_id)
t = AutoTokenizer.from_pretrained(model_id)

inp = t('write a python sort function', return_tensors='pt')
with torch.no_grad():  # gradients are not needed at inference time
    # Sigmoid (not softmax): each dimension is scored independently
    scores = torch.sigmoid(m(**inp).logits)[0]

# Look labels up by index so each name matches its logit position
for i, score in enumerate(scores):
    print(f'{m.config.id2label[i]}: {score.item():.3f}')
```

## Evaluation (`human_eval` split, 200 Claude-annotated examples)
```json
{
  "eval_loss": 0.42123839259147644,
  "eval_model_preparation_time": 0.0022,
  "eval_mae_instruction_following": 0.24792593717575073,
  "eval_rmse_instruction_following": 0.30881765484809875,
  "eval_brier_instruction_following": 0.09536834806203842,
  "eval_pearson_instruction_following": 0.8270609378814697,
  "eval_spearman_instruction_following": 0.8144904545331433,
  "eval_mae_coding": 0.07370934635400772,
  "eval_rmse_coding": 0.18934082984924316,
  "eval_brier_coding": 0.03584995120763779,
  "eval_pearson_coding": 0.9140766263008118,
  "eval_spearman_coding": 0.8615511297152596,
  "eval_mae_math_reasoning": 0.10867060720920563,
  "eval_rmse_math_reasoning": 0.1694405972957611,
  "eval_brier_math_reasoning": 0.02871011756360531,
  "eval_pearson_math_reasoning": 0.9191069602966309,
  "eval_spearman_math_reasoning": 0.8252107128077218,
  "eval_mae_world_knowledge": 0.13477517664432526,
  "eval_rmse_world_knowledge": 0.1875971555709839,
  "eval_brier_world_knowledge": 0.03519269451498985,
  "eval_pearson_world_knowledge": 0.8357715606689453,
  "eval_spearman_world_knowledge": 0.8138721105892404,
  "eval_mae_planning_agentic": 0.19774200022220612,
  "eval_rmse_planning_agentic": 0.2537391781806946,
  "eval_brier_planning_agentic": 0.06438356637954712,
  "eval_pearson_planning_agentic": 0.8233083486557007,
  "eval_spearman_planning_agentic": 0.7674644757779185,
  "eval_mae_creative_synthesis": 0.08937528729438782,
  "eval_rmse_creative_synthesis": 0.16472801566123962,
  "eval_brier_creative_synthesis": 0.027135320007801056,
  "eval_pearson_creative_synthesis": 0.9154033660888672,
  "eval_spearman_creative_synthesis": 0.8138763391203128,
  "eval_pearson_macro": 0.8724546333154043,
  "eval_mae_macro": 0.14203305914998055,
  "eval_spearman_macro": 0.8160775370905994,
  "eval_f1_macro_t3": 0.8775192561604114,
  "eval_f1_macro_t5": 0.8368971405647821,
  "eval_f1_macro_t7": 0.8287502804667367,
  "eval_runtime": 1.384,
  "eval_samples_per_second": 144.51,
  "eval_steps_p
```
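
The per-dimension error metrics can be sketched as follows. In the numbers above, each Brier value equals the square of the corresponding RMSE (for example 0.3088² ≈ 0.0954 for `instruction_following`), so Brier is treated here as mean squared error. Reading `t3`, `t5` and `t7` as binarization thresholds of 0.3, 0.5 and 0.7, and applying the same threshold to predictions and reference scores, are assumptions; the sample values are made up.

```python
import math

def regression_metrics(preds, targets):
    """MAE, RMSE and Brier score (mean squared error) for one dimension."""
    n = len(preds)
    mae = sum(abs(p - t) for p, t in zip(preds, targets)) / n
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / n
    return {"mae": mae, "rmse": math.sqrt(mse), "brier": mse}

def f1_at_threshold(preds, targets, threshold):
    """Binary F1 after binarizing predictions and soft targets at `threshold`."""
    pred_pos = [p >= threshold for p in preds]
    true_pos = [t >= threshold for t in targets]
    tp = sum(p and t for p, t in zip(pred_pos, true_pos))
    fp = sum(p and not t for p, t in zip(pred_pos, true_pos))
    fn = sum(t and not p for p, t in zip(pred_pos, true_pos))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Made-up predictions and reference scores for a single dimension
preds = [0.9, 0.2, 0.6, 0.1]
targets = [1.0, 0.0, 0.4, 0.3]

print(regression_metrics(preds, targets))
print(f1_at_threshold(preds, targets, 0.5))
```

The reported `eval_mae_macro`, `eval_pearson_macro` and `eval_spearman_macro` values are consistent with a plain average of the six per-dimension values.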