metrics:
- bleurt
- bleu
- bertscore
pipeline_tag: sentence-similarity
---

# AlignScoreCS

AlignScoreCS is a multi-task multilingual model developed to assess factual consistency of context-claim pairs across various Natural Language Understanding (NLU) tasks,
including Summarization, Question Answering (QA), Semantic Textual Similarity (STS), Paraphrase, Fact Verification (FV), and Natural Language Inference (NLI).
It is fine-tuned on a large multi-task dataset of 7 million documents covering these NLU tasks in both Czech and English.
Thanks to its multilingual pre-training, it can potentially be used in other languages as well.
The architecture can process tasks with regression, binary classification, or ternary classification heads, although for evaluation we recommend the AlignScore function.

This work is influenced by its English counterpart, [AlignScore: Evaluating Factual Consistency with a Unified Alignment Function](https://arxiv.org/abs/2305.16739).
However, we employed homogeneous batches instead of heterogeneous ones during training, and used three distinct architectures sharing a single encoder.
This setup allows each architecture to be used independently with its own classification head.
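To illustrate the batching difference: with homogeneous batching, every batch is drawn from a single task, so each optimization step updates only one classification head alongside the shared encoder. A minimal sketch, assuming each example carries a `task` field (the field name and `homogeneous_batches` helper are illustrative, not part of the released code):

```python
from collections import defaultdict

def homogeneous_batches(examples, batch_size):
    """Group examples by task so every batch contains a single task only;
    heterogeneous batching would instead mix tasks within one batch."""
    by_task = defaultdict(list)
    for ex in examples:
        by_task[ex["task"]].append(ex)
    batches = []
    for task_examples in by_task.values():
        for i in range(0, len(task_examples), batch_size):
            batches.append(task_examples[i:i + batch_size])
    return batches
```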

## Evaluation

As in the AlignScore paper, we use the AlignScore function, which chunks the context into segments of roughly 350 tokens and splits the claim into sentences.
Each context chunk is then evaluated against each claim sentence, and the pairwise scores are aggregated into a single consistency score.
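The chunk-and-aggregate procedure can be sketched in plain Python. This is an illustrative approximation only: `align_prob` is a hypothetical stand-in for the model's alignment probability, and whitespace/regex splitting stands in for the real tokenizer and sentence splitter. Following the original AlignScore formulation, each claim sentence takes the maximum score over context chunks, and the per-sentence scores are then averaged:

```python
import re

def chunk_context(context, max_tokens=350):
    """Greedily pack whole sentences into chunks of at most ~max_tokens
    whitespace tokens (an approximation of the ~350-token chunking)."""
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    chunks, current, count = [], [], 0
    for sent in sentences:
        n = len(sent.split())
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sent)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks

def align_score(context, claim, align_prob):
    """Score every (chunk, claim-sentence) pair with align_prob, take the
    max over chunks per sentence, then average over claim sentences."""
    chunks = chunk_context(context)
    sentences = re.split(r"(?<=[.!?])\s+", claim.strip())
    per_sentence = [max(align_prob(ch, s) for ch in chunks) for s in sentences]
    return sum(per_sentence) / len(per_sentence)
```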

The AlignScoreCS model is built on three XLM-RoBERTa architectures sharing one encoder.

It is a multi-task multilingual model for assessing facticity in various NLU tasks in Czech and English, following the initial AlignScore paper (https://arxiv.org/abs/2305.16739).
We trained the model from the [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) checkpoint, with three linear layers on the shared encoder for regression,
binary classification, and ternary classification.
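Structurally, this means one encoder whose output feeds three independent heads, so the text is encoded once per forward pass regardless of which head is used. The sketch below is purely schematic, with toy stand-ins in place of the actual xlm-roberta-large encoder and trained linear layers:

```python
class SharedEncoderModel:
    """Schematic of one shared encoder feeding three task heads:
    regression, binary classification, ternary classification."""
    def __init__(self, encoder, heads):
        self.encoder = encoder  # text -> feature vector
        self.heads = heads      # head name -> (feature vector -> output)

    def forward(self, text, head):
        features = self.encoder(text)  # encoded once, shared by all heads
        return self.heads[head](features)

# Toy stand-ins: a real setup would use xlm-roberta-large embeddings and
# learned linear layers instead of these placeholder callables.
encoder = lambda text: [float(len(text))]
model = SharedEncoderModel(encoder, {
    "regression": lambda f: f[0],                  # unbounded score
    "binary": lambda f: int(f[0] > 10),            # 2-way label
    "ternary": lambda f: min(2, int(f[0] // 10)),  # 3-way label
})
```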