# XLM-RoBERTa Multi-Head Classifier (Fine-Tuned) — CoreML
Fine-tuned CoreML version of XLM-RoBERTa-base with three classification heads for on-device multilingual text analysis on Apple Silicon. Performs sentiment analysis, multi-label tagging, and named entity recognition in a single forward pass.
## Model Details

- Architecture: XLM-RoBERTa-base (12 layers, 768 hidden, 12 heads) + 3 task-specific heads
- Format: CoreML `.mlpackage` (mlprogram)
- Variants: FP16 (529 MB), INT8 (266 MB)
- Sequence length: 128 tokens
- Input: Tokenized text (`input_ids` + `attention_mask`, int32)
- Output: Three tensors — sentiment logits, tag logits, NER logits
## Heads
### Sentiment (4 classes)
Single-label classification: positive, neutral, risk, toxic
### Tags (20 multi-label)
Multi-label phrase tagging: stress_signal, confidence, emotional_state, trust_indicator, defensiveness, active_listening, rapport_building, conflict_signal, cooperation, clarification, pressure_tactic, concession, information_sharing, commitment, deadline_mention, deception_signal, manipulation, power_dynamic, agreement, problem_solving
### NER (9 BIO labels, per-token)
Named entity recognition: O, B-PER, I-PER, B-ORG, I-ORG, B-MONEY, I-MONEY, B-DATE, I-DATE
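Per-token BIO labels are usually merged into entity spans before use. A minimal Foundation-only sketch of that decoding step (the helper name and span representation are illustrative, not part of the shipped model):

```swift
import Foundation

// Merge per-token BIO labels (e.g. ["O", "B-PER", "I-PER", ...])
// into (entityType, tokenRange) spans.
func decodeBIO(_ labels: [String]) -> [(type: String, range: ClosedRange<Int>)] {
    var spans: [(String, ClosedRange<Int>)] = []
    var current: (type: String, start: Int)? = nil
    for (i, label) in labels.enumerated() {
        if label.hasPrefix("B-") {
            // A new entity begins; close any open span first.
            if let c = current { spans.append((c.type, c.start...(i - 1))) }
            current = (String(label.dropFirst(2)), i)
        } else if label.hasPrefix("I-"), let c = current,
                  String(label.dropFirst(2)) == c.type {
            continue // extend the current span
        } else {
            // "O" (or an I- tag that doesn't continue the open span).
            if let c = current { spans.append((c.type, c.start...(i - 1))) }
            current = nil
        }
    }
    if let c = current { spans.append((c.type, c.start...(labels.count - 1))) }
    return spans.map { (type: $0.0, range: $0.1) }
}
```

Stray `I-` tags without a matching open span are dropped here; a production decoder might instead treat them as implicit `B-` tags.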
## Training
- Backbone LR: 2e-5 with cosine decay + 10% linear warmup
- Head LR: 2e-4 (10x backbone)
- Epochs: 3 (best checkpoint at epoch 2)
- Batch size: 32
- Device: Apple M3 Max (MPS)
- Training data: RuSentiment (50K), GoEmotions mapped to 4-class (50K), WikiAnn-ru NER (50K), synthetic negotiation tags (1.7K)
- Multi-task loss: weighted CE (sentiment) + BCE (tags) + CE with ignore_index (NER)
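With the loss terms above, the combined objective has the following form (the weights $\lambda$ are not specified in this card):

$$
\mathcal{L} \;=\; \lambda_{\text{sent}}\,\mathrm{CE}\!\left(\hat{y}_{\text{sent}}, y_{\text{sent}}\right) \;+\; \lambda_{\text{tag}}\,\mathrm{BCE}\!\left(\hat{y}_{\text{tag}}, y_{\text{tag}}\right) \;+\; \lambda_{\text{ner}}\,\mathrm{CE}_{\text{ignore}}\!\left(\hat{y}_{\text{ner}}, y_{\text{ner}}\right)
$$

where $\mathrm{CE}_{\text{ignore}}$ skips padded token positions via `ignore_index`.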
## Metrics (Validation)
| Head | Metric | Value |
|---|---|---|
| Sentiment | Accuracy | 76.6% |
| Tags | F1 (macro) | 57.0% |
| NER | Accuracy | 94.8% |
| Combined | Val Loss | 0.492 |
## Model Files

| File | Size | Description |
|---|---|---|
| `XLMRobertaMultiHead.mlpackage/` | 529 MB | FP16 model |
| `XLMRobertaMultiHead_INT8.mlpackage/` | 266 MB | INT8 quantized (recommended) |
| `label_definitions.json` | 1 KB | Label mappings for all heads |
| `config.json` | — | Model configuration |
## Usage

```swift
import CoreML

let model = try MLModel(contentsOf: modelURL)

// Prepare inputs (use the XLM-RoBERTa tokenizer)
let inputArray = try MLMultiArray(shape: [1, 128], dataType: .int32)
let maskArray = try MLMultiArray(shape: [1, 128], dataType: .int32)
// ... fill with tokenized text ...

let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputArray),
    "attention_mask": MLFeatureValue(multiArray: maskArray)
])

let output = try model.prediction(from: input)
let sentimentLogits = output.featureValue(for: "sentiment_logits")!.multiArrayValue!
let tagLogits = output.featureValue(for: "tag_logits")!.multiArrayValue!
let nerLogits = output.featureValue(for: "ner_logits")!.multiArrayValue!
```
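Each head needs different post-processing: softmax for the single-label sentiment head, per-element sigmoid with a threshold for the multi-label tag head, and per-token argmax for NER. A Foundation-only sketch, assuming the logits have been copied out of the `MLMultiArray`s into plain Swift arrays (the 0.5 tag threshold is an assumption, not a documented default):

```swift
import Foundation

// Softmax over the 4 sentiment logits -> class probabilities.
func softmax(_ logits: [Float]) -> [Float] {
    let m = logits.max() ?? 0
    let exps = logits.map { exp($0 - m) } // subtract max for numerical stability
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

// Sigmoid per tag logit; indices above `threshold` are active tags.
func activeTags(_ logits: [Float], threshold: Float = 0.5) -> [Int] {
    logits.enumerated()
        .filter { 1 / (1 + exp(-$0.element)) > threshold }
        .map { $0.offset }
}

// Argmax per token over the 9 NER label logits.
func nerLabels(_ tokenLogits: [[Float]]) -> [Int] {
    tokenLogits.map { row in
        row.indices.max(by: { row[$0] < row[$1] }) ?? 0
    }
}
```

Map the resulting indices back to label strings via `label_definitions.json`.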
## Tokenizer

This model uses the standard XLM-RoBERTa tokenizer from FacebookAI/xlm-roberta-base. The CoreML package does not include the tokenizer, so use a `tokenizers` implementation in your app or bundle `tokenizer.json` separately.
## Attribution
Base model XLM-RoBERTa by Facebook AI. Fine-tuning on Russian/English datasets and CoreML conversion by @smkrv.