---
license: mit
library_name: coreml
base_model: FacebookAI/xlm-roberta-base
tags:
  - text-classification
  - sentiment-analysis
  - named-entity-recognition
  - multi-label-classification
  - multi-task
  - coreml
  - russian
  - xlm-roberta
language:
  - ru
  - en
pipeline_tag: text-classification
datasets:
  - RuSentiment
  - google/goEmotions
  - wikiann
---

# XLM-RoBERTa Multi-Head Classifier (Fine-Tuned) — CoreML

Fine-tuned CoreML version of XLM-RoBERTa-base with three classification heads for on-device multilingual text analysis on Apple Silicon. Performs sentiment analysis, multi-label tagging, and named entity recognition in a single forward pass.

## Model Details

- **Architecture:** XLM-RoBERTa-base (12 layers, 768 hidden, 12 heads) + 3 task-specific heads
- **Format:** CoreML `.mlpackage` (mlprogram)
- **Variants:** FP16 (529 MB), INT8 (266 MB)
- **Sequence length:** 128 tokens
- **Input:** Tokenized text (`input_ids` + `attention_mask`, int32)
- **Output:** Three tensors — sentiment logits, tag logits, NER logits

## Heads

### Sentiment (4 classes)
Single-label classification: `positive`, `neutral`, `risk`, `toxic`

### Tags (20 multi-label)
Multi-label phrase tagging: `stress_signal`, `confidence`, `emotional_state`, `trust_indicator`, `defensiveness`, `active_listening`, `rapport_building`, `conflict_signal`, `cooperation`, `clarification`, `pressure_tactic`, `concession`, `information_sharing`, `commitment`, `deadline_mention`, `deception_signal`, `manipulation`, `power_dynamic`, `agreement`, `problem_solving`

### NER (9 BIO labels, per-token)
Named entity recognition: `O`, `B-PER`, `I-PER`, `B-ORG`, `I-ORG`, `B-MONEY`, `I-MONEY`, `B-DATE`, `I-DATE`
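After taking the argmax over the per-token NER logits, the resulting BIO label sequence still has to be collapsed into entity spans. A minimal decoder sketch (the label strings follow the scheme above; consult `label_definitions.json` for the authoritative index-to-label mapping):

```swift
// Collapses a per-token BIO label sequence into entity spans.
// A stray I-* with no matching open span is treated as O, a common convention.
struct EntitySpan: Equatable {
    let type: String             // e.g. "PER", "ORG", "MONEY", "DATE"
    let range: ClosedRange<Int>  // token indices, inclusive
}

func decodeBIO(_ labels: [String]) -> [EntitySpan] {
    var spans: [EntitySpan] = []
    var current: (type: String, start: Int)? = nil

    for (i, label) in labels.enumerated() {
        if label.hasPrefix("B-") {
            // A new entity starts; close any span in progress first.
            if let c = current { spans.append(EntitySpan(type: c.type, range: c.start...(i - 1))) }
            current = (String(label.dropFirst(2)), i)
        } else if label.hasPrefix("I-"), let c = current, String(label.dropFirst(2)) == c.type {
            continue  // same entity type: extend the current span
        } else {
            // O, or an I-* that doesn't continue the open span: close it.
            if let c = current { spans.append(EntitySpan(type: c.type, range: c.start...(i - 1))) }
            current = nil
        }
    }
    if let c = current { spans.append(EntitySpan(type: c.type, range: c.start...(labels.count - 1))) }
    return spans
}
```

For example, `decodeBIO(["O", "B-PER", "I-PER", "O", "B-ORG"])` yields a `PER` span over tokens 1...2 and an `ORG` span at token 4. Note that spans are over XLM-RoBERTa subword tokens, so you still need the tokenizer's offsets to map them back to character ranges.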

## Training

- **Backbone LR:** 2e-5 with cosine decay + 10% linear warmup
- **Head LR:** 2e-4 (10× the backbone LR)
- **Epochs:** 3 (best checkpoint at epoch 2)
- **Batch size:** 32
- **Device:** Apple M3 Max (MPS)
- **Training data:** RuSentiment (50K), GoEmotions mapped to 4-class (50K), WikiAnn-ru NER (50K), synthetic negotiation tags (1.7K)
- **Multi-task loss:** weighted CE (sentiment) + BCE (tags) + CE with ignore_index (NER)
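Written out, the multi-task objective is a weighted sum of the three head losses. This is a sketch: the card does not state the sentiment class weights \(w\) or the per-task weights \(\lambda_t\), so treat them as placeholders.

```latex
\mathcal{L} \;=\; \lambda_{\mathrm{sent}}\,\mathrm{CE}_{w}\!\left(\hat{y}^{\mathrm{sent}}, y^{\mathrm{sent}}\right)
\;+\; \lambda_{\mathrm{tag}}\,\mathrm{BCE}\!\left(\hat{y}^{\mathrm{tag}}, y^{\mathrm{tag}}\right)
\;+\; \lambda_{\mathrm{ner}}\,\mathrm{CE}_{\mathrm{ignore}}\!\left(\hat{y}^{\mathrm{ner}}, y^{\mathrm{ner}}\right)
```

Here \(\mathrm{CE}_{w}\) is class-weighted cross-entropy over the 4 sentiment classes, \(\mathrm{BCE}\) is binary cross-entropy summed over the 20 independent tags, and \(\mathrm{CE}_{\mathrm{ignore}}\) is token-level cross-entropy that masks padding and continuation-subword positions via `ignore_index`.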

## Metrics (Validation)

| Head | Metric | Value |
|------|--------|-------|
| Sentiment | Accuracy | **76.6%** |
| Tags | F1 (macro) | **57.0%** |
| NER | Accuracy | **94.8%** |
| Combined | Val Loss | 0.492 |

## Model Files

| File | Size | Description |
|------|------|-------------|
| `XLMRobertaMultiHead.mlpackage/` | 529 MB | FP16 model |
| `XLMRobertaMultiHead_INT8.mlpackage/` | 266 MB | INT8 quantized (recommended) |
| `label_definitions.json` | 1 KB | Label mappings for all heads |
| `config.json` | — | Model configuration |

## Usage

```swift
import CoreML

let model = try MLModel(contentsOf: modelURL)

// Prepare inputs (use XLM-RoBERTa tokenizer)
let inputArray = try MLMultiArray(shape: [1, 128], dataType: .int32)
let maskArray = try MLMultiArray(shape: [1, 128], dataType: .int32)
// ... fill with tokenized text ...

let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputArray),
    "attention_mask": MLFeatureValue(multiArray: maskArray)
])

let output = try model.prediction(from: input)
let sentimentLogits = output.featureValue(for: "sentiment_logits")!.multiArrayValue!
let tagLogits = output.featureValue(for: "tag_logits")!.multiArrayValue!
let nerLogits = output.featureValue(for: "ner_logits")!.multiArrayValue!
```
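The raw logits need task-specific post-processing: softmax/argmax for the single-label sentiment head, and an independent per-class sigmoid with a threshold for the multi-label tag head. A sketch on plain `Float` arrays (the 0.5 threshold is an assumption; tune it against validation data):

```swift
import Foundation

// Softmax over the 4 sentiment logits -> class probabilities (single-label head).
func softmax(_ logits: [Float]) -> [Float] {
    guard let maxLogit = logits.max() else { return [] }
    // Subtract the max before exponentiating for numerical stability.
    let exps = logits.map { Float(exp(Double($0 - maxLogit))) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

// Independent sigmoid per tag -> indices of tags above the threshold (multi-label head).
func activeTags(_ logits: [Float], threshold: Double = 0.5) -> [Int] {
    logits.enumerated()
        .filter { 1 / (1 + exp(-Double($0.element))) >= threshold }
        .map { $0.offset }
}

// Example: pick the top sentiment class from raw logits.
let sentimentLabels = ["positive", "neutral", "risk", "toxic"]
let probs = softmax([2.1, 0.3, -1.0, -0.5])
let best = probs.enumerated().max(by: { $0.element < $1.element })!.offset
// sentimentLabels[best] == "positive" for these example logits
```

To apply this to the model outputs above, copy each `MLMultiArray` into a `[Float]` first (e.g. by iterating its indices); the NER head additionally needs a per-token argmax over the 9 BIO labels.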

## Tokenizer

This model uses the standard XLM-RoBERTa tokenizer from `FacebookAI/xlm-roberta-base`. The CoreML package does not include the tokenizer — use the Hugging Face `tokenizers` library, or bundle `tokenizer.json` with your app and load it separately.

## Attribution

Base model [XLM-RoBERTa](https://huggingface.co/FacebookAI/xlm-roberta-base) by Facebook AI. Fine-tuning on Russian/English datasets and CoreML conversion by [@smkrv](https://huggingface.co/smkrv).