	---
	language:
	- vi
	library_name: transformers
	pipeline_tag: text-classification
	tags:
	- vietnamese
	- summarization
	- evaluation
	- cross-encoder
	- research
	---

	# MultiEvalVietSum

	MultiEvalVietSum is a Vietnamese summary evaluation model released under the Hugging Face account phuongntc.

	It is a criterion-specific cross-encoder evaluator that takes a source document and a candidate summary as input and outputs three scalar scores:
	- Faithfulness
	- Coherence
	- Relevance

	## Model description

	This model is built on top of the multilingual long-context encoder jhu-clsp/mmBERT-base and fine-tuned as a custom evaluator for Vietnamese summarization research.

	Architecture summary:
	- Backbone: jhu-clsp/mmBERT-base
	- Input format: (document, summary) pair
	- Pooling: CLS + mean pooling
	- Prediction heads: three scalar regression heads
	- Criteria: faithfulness, coherence, relevance
	- Training objective: MSE regression + pairwise margin ranking loss
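The summary above does not say how the CLS and mean-pooled vectors are combined, nor how the two loss terms are weighted. The following is a minimal, dependency-free sketch, not the repository's code. It assumes the two vectors are concatenated, that each head is a single linear layer, and that the ranking term is a hinge on the score gap between a preferred and a less-preferred summary. All function names, `margin`, and `lam` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def cls_plus_mean_pool(hidden: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """Concatenate the first-token (CLS) vector with the mean of unmasked tokens.

    Assumption: "CLS + mean pooling" means concatenation; the released model
    may combine the two vectors differently. The mask must keep >= 1 token.
    """
    cls_vec = list(hidden[0])
    kept = [vec for vec, m in zip(hidden, mask) if m == 1]
    mean_vec = [sum(col) / len(kept) for col in zip(*kept)]
    return cls_vec + mean_vec


def linear_head(features: Vector, weights: Vector, bias: float) -> float:
    """One scalar regression head (assumed here to be a single linear layer)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mse(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between predicted and reference scores."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def margin_ranking(better: float, worse: float, margin: float = 0.1) -> float:
    """Hinge penalty when the preferred summary is not ahead by `margin`."""
    return max(0.0, margin - (better - worse))


def combined_loss(preds, targets, better, worse, margin=0.1, lam=1.0) -> float:
    """MSE regression term plus a weighted pairwise margin ranking term."""
    return mse(preds, targets) + lam * margin_ranking(better, worse, margin)
```

In this sketch, three separate `linear_head` calls on the same pooled vector would produce the faithfulness, coherence, and relevance scores.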

	## Intended use

	This model is intended for:
	- research on automatic summary evaluation in Vietnamese
	- system comparison for Vietnamese summarization
	- criterion-specific scoring of candidate summaries against a source document

	This model is not intended to replace human judgment in high-stakes evaluation settings.
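For the system-comparison use case, one simple protocol is to average each criterion over a shared set of documents, separately per system. The sketch below assumes a scoring callable that takes `(document, summary)` and returns a mapping from criterion name to score. `compare_systems` and `score_fn` are hypothetical names, and the card does not document the actual return type of the repository's scorer.

```python
from statistics import mean
from typing import Callable, Dict, List

Scores = Dict[str, float]
ScoreFn = Callable[[str, str], Scores]


def compare_systems(
    documents: List[str],
    system_outputs: Dict[str, List[str]],
    score_fn: ScoreFn,
) -> Dict[str, Scores]:
    """Average each criterion over all documents, separately per system."""
    if not documents:
        raise ValueError("at least one document is required")
    report: Dict[str, Scores] = {}
    for system, summaries in system_outputs.items():
        if len(summaries) != len(documents):
            raise ValueError(f"{system}: one summary per document is required")
        # Score every (document, summary) pair produced by this system.
        per_pair = [score_fn(doc, summ) for doc, summ in zip(documents, summaries)]
        criteria = list(per_pair[0].keys())
        report[system] = {c: mean(s[c] for s in per_pair) for c in criteria}
    return report
```

If the repository's `predict_scores` returns such a mapping, it could be passed in through a small wrapper such as `lambda d, s: predict_scores(d, s, model_dir=".")`. Averages of this kind support relative comparison only and do not replace human judgment.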

	## Input processing

	Each (document, summary) input pair is built under a fixed token budget:
	- the summary is truncated first, to at most SUM_MAX_LEN = 192 tokens
	- the remaining token budget is assigned to the source document
	- the total pair length is capped at MAX_LEN = 2048 tokens

	Because the summary is capped at 192 tokens, most of the 2048-token budget is left for source-document evidence.
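The budget rule can be illustrated on plain lists of token ids. This is a sketch of the arithmetic only, not the repository's preprocessing code. The function name `budget_pair` and the `num_special` parameter (the number of special tokens the tokenizer adds around the pair) are assumptions, as is the choice to count special tokens against MAX_LEN.

```python
from typing import List, Sequence, Tuple


def budget_pair(
    doc_ids: Sequence[int],
    sum_ids: Sequence[int],
    max_len: int = 2048,     # MAX_LEN from the card
    sum_max_len: int = 192,  # SUM_MAX_LEN from the card
    num_special: int = 3,    # assumption: e.g. [CLS] doc [SEP] summary [SEP]
) -> Tuple[List[int], List[int]]:
    """Truncate the summary first, then give the remaining budget to the document."""
    sum_kept = list(sum_ids[:sum_max_len])
    # Whatever the summary and special tokens do not use goes to the document.
    doc_budget = max(0, max_len - num_special - len(sum_kept))
    doc_kept = list(doc_ids[:doc_budget])
    return doc_kept, sum_kept
```

Under these assumptions, a summary that fills its 192-token cap leaves 2048 - 3 - 192 = 1853 tokens for the document, while a 50-token summary leaves 1995.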

	## Reported setup

	- model_name: MultiEvalVietSum
	- repo_id: phuongntc/MultiEvalVietSum
	- backbone: jhu-clsp/mmBERT-base
	- task: Vietnamese summary evaluation
	- max_len: 2048
	- summary_max_len: 192
	- pooling: CLS + mean pooling
	- outputs: faithfulness, coherence, relevance

	Validation metrics (Pearson and Spearman correlations; None means no value is reported in this card):
	- val_pearson_faith: None
	- val_pearson_coh: None
	- val_pearson_rel: None
	- val_pearson_mean: None
	- val_spearman_faith: None
	- val_spearman_coh: None
	- val_spearman_rel: None
	- val_spearman_mean: None
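Since no values are listed, users may want to compute comparable correlations on their own annotated data. The sketch below assumes the metric names refer to Pearson and Spearman correlations between model scores and reference scores for one criterion, with the `_mean` entries being the average over the three criteria. The card does not define them, so this reading is an assumption. The code uses only the standard library and computes Spearman as Pearson on average ranks.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation, computed as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold the model's scores for one criterion and `y` the matching reference scores, in the same order.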

	## Output format

	The model outputs three scalar scores:
	1. faithfulness
	2. coherence
	3. relevance

	Users may optionally combine them into an overall score using a weighting scheme appropriate for their study.
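One possible way to form such an overall score is a normalized weighted average. The sketch below assumes the three scores are available as a mapping keyed by criterion name. The function name `overall_score` and the example weights are illustrative, not part of the release.

```python
from typing import Dict, Optional


def overall_score(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted average of criterion scores; equal weights by default."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


example = {"faithfulness": 0.8, "coherence": 0.6, "relevance": 0.4}
# Weighting faithfulness twice as heavily as the other two criteria.
print(overall_score(example, {"faithfulness": 2.0, "coherence": 1.0, "relevance": 1.0}))
```

Normalizing by the weight total keeps the combined score on the same scale as the individual criteria, which makes results comparable across weighting schemes.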

	## Limitations

	- The model only sees the truncated (document, summary) pair defined by the preprocessing pipeline
	- Very long documents may be partially invisible to the evaluator
	- If a candidate summary is longer than the summary cap, only the visible portion is evaluated
	- Performance may vary across domains outside the training or evaluation distribution
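To judge how much the truncation limitations affect a given input, it can help to estimate what fraction of each text the evaluator actually sees. The sketch below works from token counts alone. The name `visibility` and the `num_special` parameter are assumptions, and the real counts depend on the repository's tokenizer and pair construction.

```python
from typing import Dict


def visibility(
    n_doc_tokens: int,
    n_sum_tokens: int,
    max_len: int = 2048,     # MAX_LEN from the card
    sum_max_len: int = 192,  # SUM_MAX_LEN from the card
    num_special: int = 3,    # assumption: special tokens count against MAX_LEN
) -> Dict[str, float]:
    """Fraction of document and summary tokens that survive truncation."""
    sum_seen = min(n_sum_tokens, sum_max_len)
    doc_seen = min(n_doc_tokens, max(0, max_len - num_special - sum_seen))
    return {
        "doc_visible": doc_seen / n_doc_tokens if n_doc_tokens else 1.0,
        "summary_visible": sum_seen / n_sum_tokens if n_sum_tokens else 1.0,
    }
```

Pairs with a low `doc_visible` or `summary_visible` value could be flagged or reported separately, since their scores reflect only part of the text.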

	## Transparency and reproducibility notes

	To reproduce scores as closely as possible, users should keep the following consistent:
	- backbone model
	- tokenizer
	- MAX_LEN
	- SUM_MAX_LEN
	- pair construction rule
	- model architecture and checkpoint

	The repository includes:
	- tokenizer files
	- evaluator weights
	- a custom loader file
	- an inference example
	- a training summary file

	## How to use

	After downloading the repo, use the included files:
	- modeling_multievalvietsum.py
	- inference_example.py

	Example:
	1. Download or clone the repository
	2. Open Python in that folder
	3. Run:

	       from inference_example import predict_scores
	       # Arguments: source document ("Original text"), candidate summary ("Summary")
	       scores = predict_scores("Văn bản gốc", "Bản tóm tắt", model_dir=".")
	       print(scores)

	## Citation

	    @misc{phuong2026multievalvietsum,
	      title={MultiEvalVietSum: A Vietnamese Criterion-Specific Evaluator for Summary Assessment},
	      author={Phuong N. T. and collaborators},
	      year={2026},
	      note={Model card and code release on Hugging Face},
	      howpublished={\url{https://huggingface.co/phuongntc/MultiEvalVietSum}}
	    }