metadata
tags:
- summarization
- mt5
- khmer
- text2text-generation
license: mit
Khmer mT5 Summarization Model (Duplicated Text)
This repository contains a fine-tuned mT5-small model for Khmer text summarization, trained specifically to collapse duplicated or redundant content into concise, coherent summaries.
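As a rough illustration of how such a model could be called, here is a minimal sketch using the Hugging Face `transformers` library. The repository id in `MODEL_ID` is a hypothetical placeholder (this card does not state the model's Hub id), and the helper names, the lack of a task prefix, and the generation settings (beam size, n-gram blocking, length limits) are assumptions for illustration, not values taken from the training setup.

```python
# Hypothetical placeholder: replace with the actual Hub id of this model.
MODEL_ID = "your-username/khmer-mt5-summarization"


def load_summarizer(model_id=MODEL_ID):
    """Load the tokenizer and seq2seq model from the Hugging Face Hub."""
    # Imported lazily so that summarize() can be used with any compatible
    # tokenizer/model pair without importing transformers up front.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def summarize(text, tokenizer, model, max_input_tokens=512, max_new_tokens=64):
    """Generate one summary for one Khmer input string."""
    # Truncate long articles to the (assumed) input limit.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # Beam search with n-gram blocking is an illustrative choice that tends
    # to discourage repeated phrases in the output.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=4,
        no_repeat_ngram_size=3,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (needs network access and the real repository id):
# tokenizer, model = load_summarizer()
# print(summarize(khmer_article, tokenizer, model))
```

If the model was fine-tuned with a task prefix or different length limits, the call to `summarize` would need to be adjusted accordingly.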
Model Details
- Base model: google/mt5-small
- Fine-tuned for: Khmer summarization with duplicate-text removal
- Training dataset: kimleang123/khmer-text-dataset-duplicated
- Task: Sequence-to-Sequence (text2text-generation)
- Evaluation: ROUGE-1/2/L on held-out Khmer articles containing repeated passages
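To make the evaluation metrics concrete, the sketch below computes ROUGE-1, ROUGE-2, and ROUGE-L F1 scores on lists of tokens using only the standard library. It is not the evaluation script used for this model, which the card does not specify. It assumes the texts are already segmented into tokens, which matters for Khmer because words are not separated by spaces. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


# Toy example with placeholder tokens (real inputs would be segmented Khmer).
candidate = ["a", "b", "c", "d"]
reference = ["a", "b", "d"]
scores = {
    "rouge1": rouge_n(candidate, reference, 1),
    "rouge2": rouge_n(candidate, reference, 2),
    "rougeL": rouge_l(candidate, reference),
}
```

A summary that still contains repeated passages has more tokens than the reference, which lowers precision and therefore F1. This is one reason ROUGE is a reasonable fit for a duplicate-removal task.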