IT vs Non-IT TinyBERT Classifier, R98

Binary vacancy classifier for the first-stage gate: IT role vs Non-IT role.

This version is tuned for a target IT recall of at least 0.98 and trained with a lower learning rate (2e-5) than the R99 baseline.

Model

  • Base model: cointegrated/rubert-tiny2
  • Architecture: BertForSequenceClassification
  • Labels:
    • 0: NonIT
    • 1: IT
  • Input text: title + " . " + description
  • Description truncation: 2000 characters
  • Max sequence length: 384

Training Data

Dataset: data/labeled/it_nonit_train_28_05_v2.csv

Rows:

Source                                 Rows
Old gold labels                        7,085
OpenAI-labeled TF-IDF candidates       13,445
OpenAI-labeled TF-IDF deferred non-IT  6,617
Total                                  27,147

Binary balance:

Class   Rows
IT      9,836
Non-IT  17,311
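
As a quick sanity check on the class balance, the short sketch below recomputes the total, the IT share, and the negative-to-positive ratio from the counts above. The ratio is the kind of quantity a positive-class weight is often derived from; whether `train_it_bert.py` uses it as the base for `--pos-weight-mult` is an assumption, not something stated here.

```python
# Row counts copied from the "Binary balance" table.
counts = {"IT": 9_836, "NonIT": 17_311}

total = sum(counts.values())
it_share = counts["IT"] / total

# Negative-to-positive ratio. Treating this as the base positive-class
# weight is an assumption about the training script, not a documented fact.
neg_pos_ratio = counts["NonIT"] / counts["IT"]

print(f"total={total} it_share={it_share:.3f} neg_pos_ratio={neg_pos_ratio:.2f}")
```

The total matches the 27,147 rows in the source table, and IT vacancies make up roughly 36% of the data.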

Training Setup

Command:

python3 classifier_agent/train_it_bert.py \
  --input data/labeled/it_nonit_train_28_05_v2.csv \
  --output-dir classifier_agent/it_vs_nonit_tiny_r98_lr2e5 \
  --device mps \
  --epochs 6 \
  --patience 2 \
  --batch-size 32 \
  --eval-batch-size 64 \
  --lr 2e-5 \
  --max-len 384 \
  --desc-limit 2000 \
  --target-recall 0.98 \
  --pos-weight-mult 1.0

Best checkpoint:

epoch 3

Validation Metrics

Threshold selected for target recall IT >= 0.98.

Metric        Value
ROC-AUC       0.9952
Threshold     0.2934
Precision IT  0.9089
Recall IT     0.9804
F1 IT         0.9433
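
One plausible way to select a threshold for a target recall is to take the highest threshold that still keeps the required share of IT examples at or above it. The sketch below illustrates that rule. The function name `pick_threshold` and the toy scores are hypothetical, and the actual selection logic in `train_it_bert.py` may differ.

```python
import math

def pick_threshold(y_true, proba_it, target_recall=0.98):
    """Return the highest threshold t such that recall(proba >= t) >= target_recall."""
    # Scores of the true IT examples, highest first.
    pos_scores = sorted(
        (p for y, p in zip(y_true, proba_it) if y == 1), reverse=True
    )
    if not pos_scores:
        raise ValueError("need at least one IT example")
    # Minimum number of IT examples that must score at or above the threshold.
    # The small epsilon guards against float error when the product is a whole number.
    k = max(1, math.ceil(target_recall * len(pos_scores) - 1e-9))
    return pos_scores[k - 1]

# Synthetic scores for illustration only; not taken from the validation set.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
proba = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05,
         0.35, 0.25, 0.15, 0.03, 0.01]

threshold = pick_threshold(y_true, proba, target_recall=0.75)
hits = sum(1 for y, p in zip(y_true, proba) if y == 1 and p >= threshold)
recall = hits / sum(y_true)
print(threshold, recall)
```

Applied to a validation set with 1,476 IT rows, this rule would require at least ceil(0.98 × 1476) = 1447 true positives. That is consistent with the confusion matrix reported for this model, although the agreement does not confirm the script's actual method.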

Confusion matrix at threshold 0.2934:

rows=true, cols=pred [NonIT, IT]

[[2452,  145],
 [  29, 1447]]
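
The precision, recall, and F1 values reported above can be reproduced from this confusion matrix. A minimal check, using only the four counts shown:

```python
# Confusion matrix at threshold 0.2934: rows = true, cols = pred, order [NonIT, IT].
cm = [[2452, 145],
      [29, 1447]]
(tn, fp), (fn, tp) = cm

precision = tp / (tp + fp)   # share of predicted IT that is truly IT
recall = tp / (tp + fn)      # share of true IT that is caught
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
# precision=0.9089 recall=0.9804 f1=0.9433
```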

Error Analysis

False negatives: 29.

Main FN categories:

Category            Count
Project Manager     6
Product Manager     6
Support / Sysadmin  5
Data Analyst        5
HR / Recruiter      2
Mobile              1
InfoSec / Security  1
Systems Analyst     1
Designer            1
Embedded            1

False positives: 145, all labeled Не IT (Non-IT).

Compared with the R99 baseline, this version trades about one point of IT recall for about 3.5 points of IT precision:

Model                       Precision IT  Recall IT  F1 IT   FN  FP
it_vs_nonit_tiny R99        0.8739        0.9905     0.9285  14  211
it_vs_nonit_tiny_r98_lr2e5  0.9089        0.9804     0.9433  29  145
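
The trade-off can be quantified from the FN and FP columns alone. The sketch below assumes both rows were measured on the same validation split, which the reported recalls are consistent with (1,476 IT rows in each case); the dictionary layout is purely illustrative.

```python
# FN / FP counts copied from the comparison table.
models = {
    "it_vs_nonit_tiny R99": {"fn": 14, "fp": 211},
    "it_vs_nonit_tiny_r98_lr2e5": {"fn": 29, "fp": 145},
}
r99 = models["it_vs_nonit_tiny R99"]
r98 = models["it_vs_nonit_tiny_r98_lr2e5"]

extra_fn = r98["fn"] - r99["fn"]   # additional IT vacancies missed by R98
fewer_fp = r99["fp"] - r98["fp"]   # Non-IT vacancies no longer passed through
fp_saved_per_fn = fewer_fp / extra_fn

print(extra_fn, fewer_fp, round(fp_saved_per_fn, 1))
# 15 66 4.4
```

In other words, under that assumption R98 removes about 4.4 false positives for every additional IT vacancy it misses.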

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_DIR = "classifier_agent/it_vs_nonit_tiny_r98_lr2e5"
# Decision threshold selected on validation for target recall IT >= 0.98.
THRESHOLD = 0.2934192717075348

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR).eval()

def is_it(title: str, description: str = "") -> tuple[bool, float]:
    """Return (is_it, proba_it) for a single vacancy."""
    # Collapse whitespace in the description, cut it to 2000 characters,
    # and join as title + " . " + description.
    text = f"{title.strip()} . {' '.join(description.split())[:2000]}"
    enc = tokenizer(text, truncation=True, max_length=384, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    # Label index 1 is IT.
    proba_it = torch.softmax(logits, dim=-1)[0, 1].item()
    return proba_it >= THRESHOLD, proba_it

Recommendation

Use this version when reducing false positives is more important than catching the last 1% of weak or ambiguous IT roles. Keep the R99 model as a broader safety gate.
