krotima1 committed
Commit 33ce6d4 · verified · 1 Parent(s): 5944a56

Update README.md

Files changed (1): README.md +20 -2
README.md CHANGED
@@ -7,11 +7,29 @@ metrics:
   - bleurt
   - bleu
   - bertscore
-pipeline_tag: sentence-similarity
 ---
 # AlignScoreCS
+
+A MultiTask multilingual model developed to assess factual consistency in context-claim pairs across various Natural Language Understanding (NLU) tasks,
+including Summarization, Question Answering (QA), Semantic Textual Similarity (STS), Paraphrase, Fact Verification (FV), and Natural Language Inference (NLI).
+AlignScoreCS is fine-tuned on a large multi-task dataset of 7 million documents covering these NLU tasks in both Czech and English.
+Its multilingual pre-training also makes it potentially usable in other languages. The architecture can process tasks using regression,
+binary classification, or ternary classification, although for evaluation we recommend the AlignScore function.
+
+This work is influenced by its English counterpart, [AlignScore: Evaluating Factual Consistency with a Unified Alignment Function](https://arxiv.org/abs/2305.16739).
+However, we employed homogeneous batches instead of heterogeneous ones during training, and we used three distinct architectures sharing a single encoder.
+This setup allows each architecture to be used independently with its own classification head.
+
+
+## Evaluation
+As in the AlignScore paper, we use the AlignScore function, which chunks the context into pieces of roughly 350 tokens and splits the claim into sentences.
+Each context chunk is evaluated against each claim sentence, and the results are aggregated into a single consistency score.
+
+The AlignScoreCS model is built on three XLM-RoBERTa architectures sharing one encoder.
+
+
 MultiTask multilingual model for assessing facticity in various NLU tasks in Czech and English. We followed the original AlignScore paper: https://arxiv.org/abs/2305.16739.
-We trained a model using a shared architecture of checkpoint xlm-roberta-large https://huggingface.co/FacebookAI/xlm-roberta-large with three linear layers for regression,
+We trained a model using a shared architecture of the checkpoint [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) with three linear layers for regression,
 binary classification and ternary classification.
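The chunk-and-aggregate evaluation added in this commit can be sketched in plain Python. The helper names below are hypothetical and the model call is stubbed out with a `score_pair` callable; it is a minimal sketch of the described scheme, not the released implementation:

```python
# Sketch of an AlignScore-style evaluation: chunk the context into pieces of
# roughly 350 tokens, split the claim into sentences, score every
# chunk/sentence pair, and aggregate into one consistency score.

def chunk_context(context: str, chunk_size: int = 350) -> list[str]:
    """Greedily group whitespace tokens into chunks of ~chunk_size tokens."""
    tokens = context.split()
    return [" ".join(tokens[i:i + chunk_size])
            for i in range(0, len(tokens), chunk_size)] or [""]

def split_sentences(claim: str) -> list[str]:
    """Naive sentence splitter; a real pipeline would use a proper tokenizer."""
    text = claim.replace("!", ".").replace("?", ".")
    return [s.strip() for s in text.split(".") if s.strip()]

def align_score(context: str, claim: str, score_pair) -> float:
    """score_pair(chunk, sentence) -> float in [0, 1] is the (stubbed) model.

    For each claim sentence, take the best-supported context chunk (max over
    chunks), then average over sentences for the final score.
    """
    chunks = chunk_context(context)
    sentences = split_sentences(claim)
    if not sentences:
        return 0.0
    per_sentence = [max(score_pair(c, s) for c in chunks) for s in sentences]
    return sum(per_sentence) / len(per_sentence)
```

The max-over-chunks, mean-over-sentences aggregation follows the original AlignScore paper; that AlignScoreCS uses exactly this aggregation is an assumption here.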
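The shared-encoder setup described above (one encoder feeding three task-specific heads) might look like the following PyTorch sketch. A toy linear layer stands in for xlm-roberta-large so the snippet stays self-contained, and the head and task names are illustrative, not the model's actual API:

```python
import torch
import torch.nn as nn

class SharedEncoderThreeHeads(nn.Module):
    """One shared encoder with three independent task heads.

    In practice the encoder would be
    AutoModel.from_pretrained("FacebookAI/xlm-roberta-large"); a small linear
    layer is used here as a placeholder so the sketch runs anywhere.
    """

    def __init__(self, in_dim: int = 8, hidden: int = 16):
        super().__init__()
        self.encoder = nn.Linear(in_dim, hidden)      # placeholder encoder
        self.heads = nn.ModuleDict({
            "regression": nn.Linear(hidden, 1),       # regression head
            "binary": nn.Linear(hidden, 2),           # binary classification
            "ternary": nn.Linear(hidden, 3),          # ternary classification
        })

    def forward(self, x: torch.Tensor, task: str) -> torch.Tensor:
        h = self.encoder(x)        # shared representation for all tasks
        return self.heads[task](h) # only the selected head is applied
```

Because the heads share nothing but the encoder, each can be used on its own, which matches the commit's note that every architecture works independently with its classification head.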
35