---
license: mit
library_name: coreml
base_model: FacebookAI/xlm-roberta-base
tags:
- text-classification
- sentiment-analysis
- named-entity-recognition
- multi-label-classification
- multi-task
- coreml
- russian
- xlm-roberta
language:
- ru
- en
pipeline_tag: text-classification
datasets:
- RuSentiment
- google/goEmotions
- wikiann
---

# XLM-RoBERTa Multi-Head Classifier (Fine-Tuned) — CoreML

Fine-tuned CoreML version of XLM-RoBERTa-base with three classification heads for on-device multilingual text analysis on Apple Silicon. Performs sentiment analysis, multi-label tagging, and named entity recognition in a single forward pass.

## Model Details

- **Architecture:** XLM-RoBERTa-base (12 layers, 768 hidden, 12 heads) + 3 task-specific heads
- **Format:** CoreML `.mlpackage` (mlprogram)
- **Variants:** FP16 (529 MB), INT8 (266 MB)
- **Sequence length:** 128 tokens
- **Input:** Tokenized text (`input_ids` + `attention_mask`, int32)
- **Output:** Three tensors — sentiment logits, tag logits, NER logits

## Heads

### Sentiment (4 classes)

Single-label classification: `positive`, `neutral`, `risk`, `toxic`

### Tags (20 multi-label)

Multi-label phrase tagging: `stress_signal`, `confidence`, `emotional_state`, `trust_indicator`, `defensiveness`, `active_listening`, `rapport_building`, `conflict_signal`, `cooperation`, `clarification`, `pressure_tactic`, `concession`, `information_sharing`, `commitment`, `deadline_mention`, `deception_signal`, `manipulation`, `power_dynamic`, `agreement`, `problem_solving`

### NER (9 BIO labels, per-token)

Named entity recognition: `O`, `B-PER`, `I-PER`, `B-ORG`, `I-ORG`, `B-MONEY`, `I-MONEY`, `B-DATE`, `I-DATE`

## Training

- **Backbone LR:** 2e-5 with cosine decay + 10% linear warmup
- **Head LR:** 2e-4 (10× the backbone LR)
- **Epochs:** 3 (best checkpoint at epoch 2)
- **Batch size:** 32
- **Device:** Apple M3 Max (MPS)
- **Training data:** RuSentiment (50K), GoEmotions mapped to 4-class (50K), WikiAnn-ru NER (50K), synthetic negotiation tags (1.7K)
- **Multi-task loss:** weighted CE (sentiment) + BCE (tags) + CE with ignore_index (NER)

## Metrics (Validation)

| Head | Metric | Value |
|------|--------|-------|
| Sentiment | Accuracy | **76.6%** |
| Tags | F1 (macro) | **57.0%** |
| NER | Accuracy | **94.8%** |
| Combined | Val Loss | 0.492 |

## Model Files

| File | Size | Description |
|------|------|-------------|
| `XLMRobertaMultiHead.mlpackage/` | 529 MB | FP16 model |
| `XLMRobertaMultiHead_INT8.mlpackage/` | 266 MB | INT8 quantized (recommended) |
| `label_definitions.json` | 1 KB | Label mappings for all heads |
| `config.json` | — | Model configuration |

## Usage

```swift
import CoreML

let model = try MLModel(contentsOf: modelURL)

// Prepare inputs (use the XLM-RoBERTa tokenizer)
let inputArray = try MLMultiArray(shape: [1, 128], dataType: .int32)
let maskArray = try MLMultiArray(shape: [1, 128], dataType: .int32)
// ... fill with tokenized text ...

let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputArray),
    "attention_mask": MLFeatureValue(multiArray: maskArray)
])

let output = try model.prediction(from: input)
let sentimentLogits = output.featureValue(for: "sentiment_logits")!.multiArrayValue!
let tagLogits = output.featureValue(for: "tag_logits")!.multiArrayValue!
let nerLogits = output.featureValue(for: "ner_logits")!.multiArrayValue!
```

## Tokenizer

This model uses the standard XLM-RoBERTa tokenizer from `FacebookAI/xlm-roberta-base`. The CoreML package does not include the tokenizer — use the `tokenizers` library or bundle `tokenizer.json` separately.

## Attribution

Base model [XLM-RoBERTa](https://huggingface.co/FacebookAI/xlm-roberta-base) by Facebook AI. Fine-tuning on Russian/English datasets and CoreML conversion by [@smkrv](https://huggingface.co/smkrv).
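## Decoding the Outputs

All three heads return raw logits, so the app is responsible for post-processing. The sketch below is not shipped with the model; it shows one reasonable decoding scheme: softmax + argmax for the single-label sentiment head, an independent sigmoid per tag with an assumed 0.5 threshold (tune this on your own validation data), and a per-token argmax over the 9 BIO labels for NER. Plain `[Float]` arrays stand in for values read out of the `MLMultiArray` outputs, and the tag logits are truncated to 3 of the 20 classes for brevity.

```swift
import Foundation

// Softmax over a logit vector (single-label heads: sentiment).
func softmax(_ logits: [Float]) -> [Float] {
    let maxLogit = logits.max() ?? 0
    let exps = logits.map { expf($0 - maxLogit) }  // shift for numerical stability
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

// Sigmoid for independent multi-label scores (tags head).
func sigmoid(_ x: Float) -> Float { 1 / (1 + expf(-x)) }

// Sentiment: softmax, then take the argmax class.
let sentimentLabels = ["positive", "neutral", "risk", "toxic"]
let sentimentLogits: [Float] = [2.1, 0.3, -1.0, -2.2]  // illustrative values
let probs = softmax(sentimentLogits)
let best = probs.indices.max { probs[$0] < probs[$1] }!
print(sentimentLabels[best])  // prints "positive" for these logits

// Tags: keep every tag whose sigmoid score clears the threshold.
let tagLogits: [Float] = [1.2, -0.4, 0.8]  // 3 of 20 classes, illustrative
let activeTags = tagLogits.indices.filter { sigmoid(tagLogits[$0]) >= 0.5 }

// NER: argmax over the 9 BIO labels for each token independently.
let nerLabels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
                 "B-MONEY", "I-MONEY", "B-DATE", "I-DATE"]
let tokenLogits: [[Float]] = [[3.0, 0.1, -1, -1, -1, -1, -1, -1, -1],
                              [0.0, 2.5, -1, -1, -1, -1, -1, -1, -1]]
let nerTags = tokenLogits.map { row in
    nerLabels[row.indices.max { row[$0] < row[$1] }!]
}
```

When decoding NER in practice, remember that predictions are per subword token; map them back to words via the tokenizer's offsets and skip positions masked out by `attention_mask`.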