---
language:
- vi
library_name: transformers
pipeline_tag: text-classification
tags:
- vietnamese
- summarization
- evaluation
- cross-encoder
- research
---
# MultiEvalVietSum

MultiEvalVietSum is a Vietnamese summary evaluation model released under the Hugging Face account phuongntc.

It is a criterion-specific cross-encoder evaluator that takes a source document and a candidate summary as input and outputs three scalar scores:

- Faithfulness
- Coherence
- Relevance

## Model description

This model is built on top of the multilingual long-context encoder jhu-clsp/mmBERT-base and fine-tuned as a custom evaluator for Vietnamese summarization research.

Architecture summary:

- Backbone: jhu-clsp/mmBERT-base
- Input format: (document, summary) pair
- Pooling: CLS + mean pooling
- Prediction heads: three scalar regression heads
- Criteria: faithfulness, coherence, relevance
- Training objective: MSE regression + pairwise margin ranking loss
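The architecture summary above can be pictured with a short PyTorch sketch. It is an illustration under stated assumptions, not the code in modeling_multievalvietsum.py: the class name `ThreeHeadEvaluator`, the concatenation of the CLS vector with the masked mean, the single linear layer per head, the in-batch pairing, and the `margin` and `rank_weight` values are all hypothetical. A random tensor stands in for the backbone output so the block runs without downloading weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ThreeHeadEvaluator(nn.Module):
    """Hypothetical pooling + regression heads on top of encoder hidden states."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Assumption: CLS and mean-pooled vectors are concatenated (2 * hidden_size).
        self.heads = nn.ModuleDict({
            name: nn.Linear(2 * hidden_size, 1)
            for name in ("faithfulness", "coherence", "relevance")
        })

    def forward(self, hidden_states, attention_mask):
        cls_vec = hidden_states[:, 0]  # first-token (CLS) vector
        mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)
        # Mean over non-padding positions only.
        mean_vec = (hidden_states * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        pooled = torch.cat([cls_vec, mean_vec], dim=-1)
        # One scalar per criterion, stacked into shape (batch, 3).
        return torch.cat([head(pooled) for head in self.heads.values()], dim=-1)


def combined_loss(pred, gold, margin=0.1, rank_weight=1.0):
    """MSE regression plus a pairwise margin ranking term over adjacent batch items."""
    mse = F.mse_loss(pred, gold)
    p_a, p_b = pred[:-1], pred[1:]
    sign = torch.sign(gold[:-1] - gold[1:])  # +1 if item a should outrank item b
    keep = sign != 0                         # skip tied pairs
    if keep.any():
        rank = F.margin_ranking_loss(p_a[keep], p_b[keep], sign[keep], margin=margin)
    else:
        rank = pred.new_zeros(())
    return mse + rank_weight * rank


torch.manual_seed(0)
model = ThreeHeadEvaluator(hidden_size=16)
hidden = torch.randn(4, 10, 16)             # stand-in for encoder output
mask = torch.ones(4, 10, dtype=torch.long)  # no padding in this toy batch
scores = model(hidden, mask)                # shape (4, 3)
loss = combined_loss(scores, torch.rand(4, 3))
```

How pairs are formed for the ranking term (here, neighbouring items in a batch) is a design choice the model card does not specify; the released training code may pair summaries of the same document instead.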
## Intended use

This model is intended for:

- research on automatic summary evaluation in Vietnamese
- system comparison for Vietnamese summarization
- criterion-specific scoring of candidate summaries against a source document

This model is not intended to replace human judgment in high-stakes evaluation settings.

## Input processing

The evaluator builds each (document, summary) input under a fixed token budget:

- the summary is truncated first, to at most SUM_MAX_LEN = 192 tokens
- the remaining token budget is assigned to the source document
- the total pair length is capped at MAX_LEN = 2048 tokens

Because the summary cap is small relative to MAX_LEN, most of the budget is left for source-document evidence.
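The budgeting rule above can be sketched on plain lists of token ids. This is a minimal illustration, not the repository's preprocessing code: the function name `build_pair`, the `num_special` reservation for special tokens, and truncating both texts from the end are assumptions.

```python
def build_pair(doc_ids, sum_ids, max_len=2048, sum_max_len=192, num_special=3):
    """Truncate the summary first, then give the remaining budget to the document."""
    # Step 1: cap the summary at sum_max_len tokens.
    sum_kept = sum_ids[:sum_max_len]
    # Step 2: whatever is left after the summary and the (assumed) special
    # tokens goes to the source document.
    doc_budget = max(max_len - num_special - len(sum_kept), 0)
    doc_kept = doc_ids[:doc_budget]
    return doc_kept, sum_kept


# A 5000-token document paired with a 300-token summary.
doc_kept, sum_kept = build_pair(list(range(5000)), list(range(300)))
print(len(doc_kept), len(sum_kept))  # 1853 192
```

With a shorter summary the document receives the unused part of the summary cap as well, so a 50-token summary would leave 1995 document tokens under the same assumptions.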
## Reported setup

- model_name: MultiEvalVietSum
- repo_id: phuongntc/MultiEvalVietSum
- backbone: jhu-clsp/mmBERT-base
- task: Vietnamese summary evaluation
- max_len: 2048
- summary_max_len: 192
- pooling: CLS + mean pooling
- outputs: faithfulness, coherence, relevance

Validation metrics (every value is recorded as None, so no validation correlations are reported in this release):

- val_pearson_faith: None
- val_pearson_coh: None
- val_pearson_rel: None
- val_pearson_mean: None
- val_spearman_faith: None
- val_spearman_coh: None
- val_spearman_rel: None
- val_spearman_mean: None
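The metric names above suggest Pearson and Spearman correlations between predicted and human scores, computed per criterion and then averaged. Since no values are given, the sketch below only shows how such correlations could be computed on one's own validation data; the helper names and the toy numbers are illustrative assumptions, and tied values receive average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


human = [1.0, 2.0, 3.0, 4.0, 5.0]        # toy human ratings for one criterion
predicted = [0.1, 0.4, 0.35, 0.8, 0.9]   # toy model scores for the same items
print(round(pearson(human, predicted), 3), round(spearman(human, predicted), 3))
```

Running the same two functions once per criterion and averaging the three results would give values comparable in kind to the `*_mean` entries.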
## Output format

The model outputs three scalar scores:

1. faithfulness
2. coherence
3. relevance

Users may optionally combine them into an overall score using a weighting scheme appropriate for their study.
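One simple way to combine the three scores is a weighted average. The sketch below is only an example of such a scheme: the function name `overall_score`, the dict layout of the scores, and the example weights are assumptions, not part of the released code.

```python
def overall_score(scores, weights=None):
    """Weighted average of the three criterion scores."""
    criteria = ("faithfulness", "coherence", "relevance")
    if weights is None:
        weights = {c: 1.0 for c in criteria}  # equal weighting by default
    total = sum(weights[c] for c in criteria)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[c] * scores[c] for c in criteria) / total


example = {"faithfulness": 0.8, "coherence": 0.6, "relevance": 0.7}
equal = overall_score(example)
# Hypothetical study that counts faithfulness twice as much as the other criteria.
faith_heavy = overall_score(
    example, {"faithfulness": 2.0, "coherence": 1.0, "relevance": 1.0}
)
print(equal, faith_heavy)
```

Dividing by the sum of the weights keeps the overall score on the same scale as the individual criteria, whatever weights a study chooses.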
## Limitations

- The model only sees the truncated (document, summary) pair defined by the preprocessing pipeline
- Very long documents may be partially invisible to the evaluator
- If a candidate summary is longer than the summary cap, only the visible portion is evaluated
- Performance may vary across domains outside the training or evaluation distribution

## Transparency and reproducibility notes

To reproduce scores as closely as possible, users should keep the following consistent:

- backbone model
- tokenizer
- MAX_LEN
- SUM_MAX_LEN
- pair construction rule
- model architecture and checkpoint
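A small check can flag settings that drift from the reported setup before scores are compared across runs. This is a hypothetical helper: the values are copied from the reported setup in this card, while `REPORTED_SETUP`, `setup_mismatches`, and the layout of `my_run` are illustrative names and structure.

```python
REPORTED_SETUP = {
    "backbone": "jhu-clsp/mmBERT-base",
    "max_len": 2048,
    "summary_max_len": 192,
    "pooling": "CLS + mean pooling",
}


def setup_mismatches(run_config, reference=REPORTED_SETUP):
    """Return {key: (expected, actual)} for every setting that differs."""
    return {
        key: (expected, run_config.get(key))
        for key, expected in reference.items()
        if run_config.get(key) != expected
    }


# A run that accidentally halves the total length cap.
my_run = dict(REPORTED_SETUP, max_len=1024)
print(setup_mismatches(my_run))  # {'max_len': (2048, 1024)}
```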
The repository includes:

- tokenizer files
- evaluator weights
- a custom loader file
- an inference example
- a training summary file
## How to use

After downloading the repo, use the included files:

- modeling_multievalvietsum.py
- inference_example.py

Example:

1. Download or clone the repository
2. Open Python in that folder
3. Run:

```python
# predict_scores is provided by inference_example.py in this repository
from inference_example import predict_scores

# Arguments: the source document ("Văn bản gốc" = "source text"),
# then the candidate summary ("Bản tóm tắt" = "summary")
scores = predict_scores("Văn bản gốc", "Bản tóm tắt", model_dir=".")
print(scores)
```
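For system comparison, per-summary scores can be averaged per system. The sketch below assumes, without confirmation from the repository, that the scoring function returns a dict keyed by faithfulness, coherence, and relevance; `mean_scores_by_system`, `score_fn`, and the stub scorer are hypothetical names. In practice `score_fn` could wrap `predict_scores` once its actual return format has been checked.

```python
from collections import defaultdict

CRITERIA = ("faithfulness", "coherence", "relevance")


def mean_scores_by_system(records, score_fn):
    """Average each criterion over all summaries produced by each system.

    records:  iterable of (system_name, document, summary) triples.
    score_fn: callable (document, summary) -> dict with the CRITERIA keys (assumed).
    """
    sums = defaultdict(lambda: dict.fromkeys(CRITERIA, 0.0))
    counts = defaultdict(int)
    for system, document, summary in records:
        scores = score_fn(document, summary)
        for c in CRITERIA:
            sums[system][c] += float(scores[c])
        counts[system] += 1
    return {s: {c: sums[s][c] / counts[s] for c in CRITERIA} for s in sums}


def stub_score_fn(document, summary):
    # Placeholder scorer so the sketch runs without the model:
    # longer summaries simply score higher, capped at 1.0.
    return dict.fromkeys(CRITERIA, min(len(summary) / 10, 1.0))


result = mean_scores_by_system(
    [
        ("system_a", "doc 1", "abcde"),
        ("system_a", "doc 2", "abcdefghij"),
        ("system_b", "doc 1", "ab"),
    ],
    stub_score_fn,
)
print(result)
```

Passing the scorer in as an argument keeps the aggregation independent of how the model is loaded, so the same function works with a stub, the local model, or cached scores.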
## Citation

```bibtex
@misc{phuong2026multievalvietsum,
  title={MultiEvalVietSum: A Vietnamese Criterion-Specific Evaluator for Summary Assessment},
  author={Phuong N. T. and collaborators},
  year={2026},
  note={Model card and code release on Hugging Face},
  howpublished={\url{https://huggingface.co/phuongntc/MultiEvalVietSum}}
}
```