---
language:
- vi
library_name: transformers
pipeline_tag: text-classification
tags:
- vietnamese
- summarization
- evaluation
- cross-encoder
- research
---

# MultiEvalVietSum

MultiEvalVietSum is a Vietnamese summary evaluation model released under the Hugging Face account phuongntc.

It is a criterion-specific cross-encoder evaluator that takes a source document and a candidate summary as input and outputs three scalar scores:
- Faithfulness
- Coherence
- Relevance

## Model description

This model is built on top of the multilingual long-context encoder jhu-clsp/mmBERT-base and fine-tuned as a custom evaluator for Vietnamese summarization research.

Architecture summary:
- Backbone: jhu-clsp/mmBERT-base
- Input format: (document, summary) pair
- Pooling: CLS + mean pooling
- Prediction heads: three scalar regression heads
- Criteria: faithfulness, coherence, relevance
- Training objective: MSE regression + pairwise margin ranking loss
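
The pooling and training objective listed above can be sketched in plain Python. This is an illustrative sketch, not the repository's code. It assumes that "CLS + mean pooling" means concatenating the first-token vector with the attention-masked mean of all token vectors. The names `cls_plus_mean_pool`, `combined_loss`, `margin`, and `rank_weight`, and their default values, are hypothetical. The card does not say how ranking pairs are formed, so they are passed in explicitly.

```python
from typing import List, Sequence, Tuple


def cls_plus_mean_pool(hidden: Sequence[Sequence[float]], mask: Sequence[int]) -> List[float]:
    """Concatenate the first-token (CLS) vector with the masked mean of token vectors."""
    cls_vec = list(hidden[0])
    dim = len(cls_vec)
    n_real = max(sum(mask), 1)  # avoid division by zero on an all-padding input
    mean_vec = [
        sum(h[d] for h, m in zip(hidden, mask) if m) / n_real
        for d in range(dim)
    ]
    # Assumption: the two vectors are concatenated, giving 2 * dim features.
    return cls_vec + mean_vec


def combined_loss(
    pred: Sequence[float],
    gold: Sequence[float],
    pairs: Sequence[Tuple[int, int]],
    margin: float = 0.1,       # hypothetical value
    rank_weight: float = 1.0,  # hypothetical value
) -> float:
    """MSE plus a pairwise margin ranking term, for a single criterion.

    Each (i, j) in `pairs` means item i should score higher than item j.
    """
    mse = sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred)
    if not pairs:
        return mse
    rank = sum(max(0.0, margin - (pred[i] - pred[j])) for i, j in pairs) / len(pairs)
    return mse + rank_weight * rank
```

In a model with three regression heads, a loss of this shape would be computed once per criterion and the three values summed or averaged; the card does not state which.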

## Intended use

This model is intended for:
- research on automatic summary evaluation in Vietnamese
- system comparison for Vietnamese summarization
- criterion-specific scoring of candidate summaries against a source document

This model is not intended to replace human judgment in high-stakes evaluation settings.

## Input processing

The evaluator builds each (document, summary) input pair as follows:
- the summary is truncated first, to at most SUM_MAX_LEN = 192 tokens
- the remaining token budget is assigned to the source document
- the total pair length is capped at MAX_LEN = 2048 tokens

Because the summary is capped at 192 tokens, most of the 2048-token budget (at least 1856 tokens, less any special tokens) is reserved for source-document evidence.
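
The budgeting rule above can be sketched on lists of token ids. This is a minimal sketch under stated assumptions, not the repository's preprocessing code: the function name `build_pair`, the count of three special tokens, and the choice to keep the leading tokens when truncating are all assumptions for illustration.

```python
from typing import List, Tuple

MAX_LEN = 2048      # total pair length cap (from the card)
SUM_MAX_LEN = 192   # summary cap (from the card)
NUM_SPECIAL = 3     # assumption: e.g. one start token and two separators


def build_pair(
    doc_ids: List[int],
    sum_ids: List[int],
    max_len: int = MAX_LEN,
    sum_max_len: int = SUM_MAX_LEN,
    num_special: int = NUM_SPECIAL,
) -> Tuple[List[int], List[int]]:
    """Truncate the summary first, then give the remaining budget to the document."""
    # Step 1: cap the summary (assumption: keep its leading tokens).
    sum_kept = sum_ids[:sum_max_len]
    # Step 2: whatever is left after the summary and special tokens goes to the document.
    doc_budget = max(max_len - num_special - len(sum_kept), 0)
    doc_kept = doc_ids[:doc_budget]
    return doc_kept, sum_kept


doc_kept, sum_kept = build_pair(list(range(5000)), list(range(300)))
print(len(doc_kept), len(sum_kept))
```

A shorter summary frees more of the budget for the document, so the visible document length depends on the summary length.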

## Reported setup

- model_name: MultiEvalVietSum
- repo_id: phuongntc/MultiEvalVietSum
- backbone: jhu-clsp/mmBERT-base
- task: Vietnamese summary evaluation
- max_len: 2048
- summary_max_len: 192
- pooling: CLS + mean pooling
- outputs: faithfulness, coherence, relevance

Validation metrics (not reported in this card; the values below are unfilled placeholders):
- val_pearson_faith: None
- val_pearson_coh: None
- val_pearson_rel: None
- val_pearson_mean: None
- val_spearman_faith: None
- val_spearman_coh: None
- val_spearman_rel: None
- val_spearman_mean: None
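
The metric names above refer to Pearson and Spearman correlations. The sketch below shows one way to compute such correlations between predicted scores and reference scores for a single criterion. The card does not describe the reference scores or the evaluation script, so the function names and the pairing of predictions with references are assumptions for illustration.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A mean value such as `val_pearson_mean` would then plausibly be the average of the three per-criterion correlations, though the card does not confirm this.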

## Output format

The model outputs three scalar scores:
1. faithfulness
2. coherence
3. relevance

Users may optionally combine them into an overall score using a weighting scheme appropriate for their study.
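
One simple way to combine the three scores is a normalized weighted average. The sketch below is illustrative only: the function name `overall_score`, the dictionary keys, and the example weights are hypothetical, and the card does not document the return format of the inference helper, so the scores are assumed to be available as a dictionary keyed by criterion.

```python
from typing import Dict


def overall_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-criterion scores, normalized by the weight total."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in weights) / total


# Hypothetical scores and weights; faithfulness counts double here.
example_scores = {"faithfulness": 0.9, "coherence": 0.6, "relevance": 0.3}
example_weights = {"faithfulness": 2.0, "coherence": 1.0, "relevance": 1.0}
print(overall_score(example_scores, example_weights))
```

Normalizing by the weight total keeps the overall score on the same scale as the individual criteria, which makes results comparable across weighting schemes.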

## Limitations

- The model only sees the truncated (document, summary) pair defined by the preprocessing pipeline
- Very long documents may be partially invisible to the evaluator
- If a candidate summary is longer than the summary cap, only the visible portion is evaluated
- Performance may vary across domains outside the training or evaluation distribution

## Transparency and reproducibility notes

To reproduce scores as closely as possible, users should keep the following consistent:
- backbone model
- tokenizer
- MAX_LEN
- SUM_MAX_LEN
- pair construction rule
- model architecture and checkpoint

The repository includes:
- tokenizer files
- evaluator weights
- a custom loader file
- an inference example
- a training summary file

## How to use

After downloading the repo, use the included files:
- modeling_multievalvietsum.py
- inference_example.py

Example:
1. Download or clone the repository
2. Open Python in that folder
3. Run:
   ```python
   from inference_example import predict_scores
   scores = predict_scores("Văn bản gốc", "Bản tóm tắt", model_dir=".")
   print(scores)
   ```

## Citation

```bibtex
@misc{phuong2026multievalvietsum,
  title={MultiEvalVietSum: A Vietnamese Criterion-Specific Evaluator for Summary Assessment},
  author={Phuong N. T. and collaborators},
  year={2026},
  note={Model card and code release on Hugging Face},
  howpublished={\url{https://huggingface.co/phuongntc/MultiEvalVietSum}}
}
```