---
license: mit
language:
- en
tags:
- roberta
- text-classification
- ensemble
- clarity
- qevasion
pipeline_tag: text-classification
---

# RoBERTa Clarity Ensemble

This repository contains **3 RoBERTa-large models** fine-tuned for clarity classification (Clear Reply / Clear Non-Reply / Ambivalent).

## Models

| Model | Description |
|-------|-------------|
| `model-1/` | RoBERTa-large fine-tuned on the clarity task |
| `model-2/` | RoBERTa-large fine-tuned on the clarity task (different seed/split) |
| `model-3/` | RoBERTa-large fine-tuned on the clarity task (different seed/split) |

## Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load one model
model = AutoModelForSequenceClassification.from_pretrained(
    "gigibot/ensemble-qeval", subfolder="model-1"
)
tokenizer = AutoTokenizer.from_pretrained(
    "gigibot/ensemble-qeval", subfolder="model-1"
)

# Or load all 3 for ensemble voting
models = []
for i in [1, 2, 3]:
    m = AutoModelForSequenceClassification.from_pretrained(
        "gigibot/ensemble-qeval", subfolder=f"model-{i}"
    )
    models.append(m)

# Ensemble inference: sum the logits across models and take the argmax
def ensemble_predict(text, models, tokenizer):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    logits_sum = None
    for model in models:
        model.eval()
        with torch.no_grad():
            out = model(**inputs)
        if logits_sum is None:
            logits_sum = out.logits
        else:
            logits_sum += out.logits
    return torch.argmax(logits_sum, dim=-1).item()
```

## Labels

- 0: Clear Reply
- 1: Clear Non-Reply
- 2: Ambivalent

## Training

Each model was fine-tuned from `roberta-large` on the QEvasion clarity dataset. See the end-to-end example below for decoding the ensemble's output into these label names.
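## Example

A minimal end-to-end sketch using `ensemble_predict` from the Usage section. The input text here is a hypothetical example, and `LABELS` simply mirrors the label order listed above:

```python
# Label order follows the Labels section of this card
LABELS = ["Clear Reply", "Clear Non-Reply", "Ambivalent"]

# Hypothetical question/answer pair for illustration
text = "Q: Will you raise taxes? A: We are reviewing all options."

pred = ensemble_predict(text, models, tokenizer)  # integer class index
print(LABELS[pred])
```

Note that summing logits is equivalent to averaging them under `argmax`, since dividing by the (constant) number of models does not change which class is largest.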