---
language:
- vi
library_name: transformers
pipeline_tag: text-classification
tags:
- vietnamese
- summarization
- evaluation
- cross-encoder
- research
---

# MultiEvalVietSum

MultiEvalVietSum is a Vietnamese summary evaluation model released under the Hugging Face account phuongntc. It is a criterion-specific cross-encoder evaluator that takes a source document and a candidate summary as input and outputs three scalar scores:

- Faithfulness
- Coherence
- Relevance

## Model description

This model is built on top of the multilingual long-context encoder `jhu-clsp/mmBERT-base` and fine-tuned as a custom evaluator for Vietnamese summarization research.

Architecture summary:

- Backbone: `jhu-clsp/mmBERT-base`
- Input format: (document, summary) pair
- Pooling: CLS + mean pooling
- Prediction heads: three scalar regression heads
- Criteria: faithfulness, coherence, relevance
- Training objective: MSE regression + pairwise margin ranking loss

## Intended use

This model is intended for:

- research on automatic summary evaluation in Vietnamese
- system comparison for Vietnamese summarization
- criterion-specific scoring of candidate summaries against a source document

This model is not intended to replace human judgment in high-stakes evaluation settings.

## Input processing

The evaluator uses a pairwise input construction strategy:

- the summary is truncated first up to `SUM_MAX_LEN = 192`
- the remaining token budget is assigned to the source document
- the total pair length is capped at `MAX_LEN = 2048`

This design prioritizes source-document evidence during evaluation.

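The budget rule above can be illustrated with a minimal sketch that works on plain token-id lists. It is not the repository's preprocessing code: the function name `build_pair_budget`, the reservation of three special tokens, and the choice to keep the first tokens of each sequence are all assumptions made for illustration. Only `MAX_LEN` and `SUM_MAX_LEN` come from this card.

```python
from typing import List, Tuple

MAX_LEN = 2048      # total cap on the pair, from the model card
SUM_MAX_LEN = 192   # cap on summary tokens, from the model card
NUM_SPECIAL = 3     # assumption: one start token and two separators


def build_pair_budget(
    doc_ids: List[int],
    sum_ids: List[int],
    max_len: int = MAX_LEN,
    sum_max_len: int = SUM_MAX_LEN,
    num_special: int = NUM_SPECIAL,
) -> Tuple[List[int], List[int]]:
    """Truncate the summary first, then give the document what is left."""
    # Step 1: cap the summary (assumption: keep its first tokens).
    sum_kept = sum_ids[:sum_max_len]
    # Step 2: the document receives whatever budget remains.
    doc_budget = max(max_len - num_special - len(sum_kept), 0)
    doc_kept = doc_ids[:doc_budget]
    return doc_kept, sum_kept


# A 5000-token document paired with a 300-token summary.
doc_kept, sum_kept = build_pair_budget(list(range(5000)), list(range(300)))
print(len(doc_kept), len(sum_kept))  # prints: 1853 192
```

Under these assumptions a shorter summary frees more room for the document, while a summary longer than 192 tokens loses its tail before the document budget is computed.
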
## Reported setup

- `model_name`: MultiEvalVietSum
- `repo_id`: phuongntc/MultiEvalVietSum
- `backbone`: jhu-clsp/mmBERT-base
- `task`: Vietnamese summary evaluation
- `max_len`: 2048
- `summary_max_len`: 192
- `pooling`: CLS + mean pooling
- `outputs`: faithfulness, coherence, relevance

Validation metrics:

- `val_pearson_faith`: None
- `val_pearson_coh`: None
- `val_pearson_rel`: None
- `val_pearson_mean`: None
- `val_spearman_faith`: None
- `val_spearman_coh`: None
- `val_spearman_rel`: None
- `val_spearman_mean`: None

## Output format

The model outputs three scalar scores:

1. faithfulness
2. coherence
3. relevance

Users may optionally combine them into an overall score using a weighting scheme appropriate for their study.

## Limitations

- The model only sees the truncated (document, summary) pair defined by the preprocessing pipeline
- Very long documents may be partially invisible to the evaluator
- If a candidate summary is longer than the summary cap, only the visible portion is evaluated
- Performance may vary across domains outside the training or evaluation distribution

## Transparency and reproducibility notes

To reproduce scores as closely as possible, users should keep the following consistent:

- backbone model
- tokenizer
- `MAX_LEN`
- `SUM_MAX_LEN`
- pair construction rule
- model architecture and checkpoint

The repository includes:

- tokenizer files
- evaluator weights
- a custom loader file
- an inference example
- a training summary file

## How to use

After downloading the repo, use the included files:

- `modeling_multievalvietsum.py`
- `inference_example.py`

Example:

1. Download or clone the repository
2. Open Python in that folder
3. Run:

       from inference_example import predict_scores

       # Vietnamese sample inputs: "Source text" and "Summary"
       scores = predict_scores("Văn bản gốc", "Bản tóm tắt", model_dir=".")
       print(scores)

## Citation

    @misc{phuong2026multievalvietsum,
      title={MultiEvalVietSum: A Vietnamese Criterion-Specific Evaluator for Summary Assessment},
      author={Phuong N. T. and collaborators},
      year={2026},
      note={Model card and code release on Hugging Face},
      howpublished={\url{https://huggingface.co/phuongntc/MultiEvalVietSum}}
    }
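
## Example: combining the three scores

The Output format section notes that the three scores may be combined into an overall score with a study-specific weighting. A minimal sketch of one such scheme, a normalized weighted average, follows. The helper `overall_score`, the dictionary keys, and the numeric values are hypothetical and chosen only for illustration; the return format of `predict_scores` is not documented in this card, so adapt the keys to whatever it actually returns.

```python
from typing import Dict, Optional

# Criterion names from the model card; using them as dict keys is an assumption.
CRITERIA = ("faithfulness", "coherence", "relevance")


def overall_score(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return a weighted average of the three criterion scores."""
    if weights is None:
        # Default to equal weighting when no scheme is supplied.
        weights = {name: 1.0 for name in CRITERIA}
    total = sum(weights[name] for name in CRITERIA)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * scores[name] for name in CRITERIA)
    return weighted / total


# Illustrative values only, not real model outputs.
example_scores = {"faithfulness": 0.8, "coherence": 0.6, "relevance": 0.7}
# Hypothetical scheme that counts faithfulness twice as much as the others.
example_weights = {"faithfulness": 2.0, "coherence": 1.0, "relevance": 1.0}
print(overall_score(example_scores, example_weights))
```

Normalizing by the weight total keeps the overall score on the same scale as the individual criteria, which makes runs with different weighting schemes easier to compare.
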