---
license: cc-by-sa-4.0
language:
- pl
tags:
- text-classification
- encoder-only
- polish
- inconsistency-detection
pipeline_tag: text-classification
model-index:
- name: asseco-group/roberta-incoherence-classifier
  results:
  - task:
      type: text-classification
      name: Document inconsistency detection (NLI-like)
    dataset:
      name: asseco-group/incoherence-bench
      type: text
      split: test
    metrics:
    - type: f1
      name: F1 (macro)
      value: 0.91
    - type: accuracy
      name: Accuracy
      value: 0.91
---

# roberta-incoherence-classifier

Encoder-based classifier for document inconsistency detection in **Polish**.

This model evaluates the semantic consistency between two text fragments (e.g. sections of legal, procurement or organizational documents). It follows an NLI-like setup but **redefines labels specifically for document coherence auditing**.

This model was **initialized from [PKOBP/polish-roberta-8k](https://huggingface.co/PKOBP/polish-roberta-8k)** and **adapted into an inconsistency classifier** through supervised training on high-quality document-style pairs.

---

## Intended Use

* Document consistency auditing (legal, public tender, IT documentation, organizational materials)
* Detecting contradictory statements, scope mismatches, and term/role/format inconsistencies
* NLI-like semantic relation classification with adapted label semantics

**Not intended for:**

* Fact-checking against external world knowledge
* Non-Polish language inputs
* General misinformation / sentiment / toxicity detection

Fine-tuning on domain-specific data is recommended for best production accuracy.

---

## Label Definition (Adapted vs. Classical NLI)

| Label | Meaning |
| ----------------- | ------- |
| **entailment** | Hypothesis is a faithful, condensed or paraphrased restatement of the premise. All critical constraints, actors, conditions and scope remain intact. |
| **neutral** | Hypothesis neither follows from nor contradicts the premise. Typically introduces unverifiable or out-of-scope information (e.g. different institutions, expanded context, unrelated assumptions). |
| **contradiction** | Hypothesis directly conflicts with the premise: it reverses permissions/requirements, or changes legal scope, numeric limits, formats, dates, or the responsible authority; or both statements cannot realistically be true at the same time. |

**Rule:** A single critical mismatch (date / territory / authority / format / obligatory vs. optional) is sufficient for `contradiction`, even if most of the text agrees.

---

## Model Details

* Base architecture: **RoBERTa-large (encoder-only)**
* Classification head: standard HF linear head on pooled representation
* Language: **Polish only**
* License: **CC-BY-SA 4.0**
* Repository: `asseco-group/roberta-incoherence-classifier`

---

## Training

* Precision: **bfloat16**
* Epochs: **5**
* Global batch: 96 × 2 devices, `gradient_accumulation_steps=11`
* Learning rate: `2e-5`, warmup ratio: `0.1`, weight decay: `0.01`
* Label smoothing: `0.05`
* Gradient checkpointing: **True**
* Model selection: best **macro F1** on validation

---

## Dataset

* ~**1.3M** labeled pairs (train + val + test)
* Balanced class distribution
* Data sources include:
  - Polish subset of [MoritzLaurer/multilingual-NLI-26lang-2mil7](https://huggingface.co/datasets/MoritzLaurer/multilingual-NLI-26lang-2mil7) (only high-quality Polish NLI pairs)
  - Synthetic high-quality document-style pairs generated specifically for this inconsistency detection task
  - No additional classical NLI datasets were used, because standard NLI label semantics do not fully align with this model's stricter document-consistency definitions
* Focus on Polish formal/procedural language (laws, tenders, IT specs, institutional instructions)

---

## Evaluation (on [asseco-group/incoherence-bench](https://huggingface.co/datasets/asseco-group/incoherence-bench), test split)

```
               precision    recall  f1-score   support

   entailment       0.94      0.90      0.92       150
      neutral       0.87      0.91      0.89       150
contradiction       0.93      0.93      0.93       150

     accuracy                           0.91       450
    macro avg       0.91      0.91      0.91       450
 weighted avg       0.91      0.91      0.91       450
```

While the task is NLI-like, the label semantics are redefined for document-level procedural consistency, for which no direct open-source baselines currently exist.
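
The macro and weighted averages in the report above can be recomputed from the per-class rows. The following is a minimal sketch in plain Python, assuming only the rounded per-class values printed in the report (so the result is approximate); the names `report`, `macro_avg` and `weighted_avg` are illustrative and not part of the model repository.

```python
# Per-class scores copied from the classification report above
# (already rounded to two decimals, so the averages are approximate).
report = {
    "entailment":    {"precision": 0.94, "recall": 0.90, "f1": 0.92, "support": 150},
    "neutral":       {"precision": 0.87, "recall": 0.91, "f1": 0.89, "support": 150},
    "contradiction": {"precision": 0.93, "recall": 0.93, "f1": 0.93, "support": 150},
}


def macro_avg(rows, metric):
    """Unweighted mean of a metric over all classes."""
    return sum(row[metric] for row in rows.values()) / len(rows)


def weighted_avg(rows, metric):
    """Mean of a metric over all classes, weighted by class support."""
    total = sum(row["support"] for row in rows.values())
    return sum(row[metric] * row["support"] for row in rows.values()) / total


for metric in ("precision", "recall", "f1"):
    print(
        f"{metric:<10}"
        f" macro={macro_avg(report, metric):.2f}"
        f" weighted={weighted_avg(report, metric):.2f}"
    )
```

Because every class has the same support (150 pairs), the weighted average coincides with the macro average, which is why both rows of the report show identical values.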
---

## Usage Example (Transformers)

```python
import torch
from transformers import pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# top_k=None returns the scores for all labels; the deprecated
# return_all_scores flag is redundant alongside it and has been dropped.
classifier = pipeline(
    "text-classification",
    model="asseco-group/roberta-incoherence-classifier",
    tokenizer="asseco-group/roberta-incoherence-classifier",
    top_k=None,
    device=device,
)

premise = (
    "Wykonawca dostarczy pliki w formacie .shp zgodne z oprogramowaniem ArcGIS 10.2, "
    "wraz z mapami wydrukowanymi w formacie A4."
)
hypo = (
    "Wykonawca przekaże wyłącznie pliki .kml kompatybilne z QGIS "
    "i przygotuje dokumentację w formacie A3."
)

# Premise and hypothesis are passed as a sentence pair.
result = classifier({"text": premise, "text_pair": hypo})
print(result)
```

### Batch / lower-level

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "asseco-group/roberta-incoherence-classifier"
tokenizer = AutoTokenizer.from_pretrained(name, use_fast=True)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Each item is a (premise, hypothesis) pair.
pairs = [
    ("Zwrot kosztów w 60 dni ...", "Zwrot kosztów nastąpi w 30 dni ..."),
]

# Tokenize premises and hypotheses as paired inputs.
enc = tokenizer(
    [p for p, h in pairs],
    [h for p, h in pairs],
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    logits = model(**enc).logits

# One row per pair; column order follows model.config.id2label.
probs = logits.softmax(-1).cpu()
print(probs)
```

---

## Limitations & Recommendations

* **Polish-only** checkpoint: input in other languages is not supported
* Complex tabular / OCR / mixed-language content may degrade quality
* Domain-specific fine-tuning is recommended for production

---

## Citation

```bibtex
@misc{asseco2025incoherence,
  title  = {Polish RoBERTa-based Incoherence/Consistency Classifier (encoder-only)},
  author = {Asseco Group},
  year   = {2025},
  url    = {https://huggingface.co/asseco-group/roberta-incoherence-classifier}
}
```