metadata
tags:
  - summarization
  - mt5
  - khmer
  - text2text-generation
license: mit

Khmer mT5 Summarization Model (Duplicated Text)

This repository contains a fine-tuned mT5-small model for Khmer text summarization, trained specifically to collapse duplicated or redundant content into concise, coherent summaries.


Model Details

  • Base model: google/mt5-small
  • Fine-tuned for: Khmer summarization with duplicate-text removal
  • Training dataset: kimleang123/khmer-text-dataset-duplicated
  • Task: Sequence-to-Sequence (text2text-generation)
  • Evaluation: ROUGE-1/2/L on held-out Khmer articles containing repeated passages
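
For readers unfamiliar with the metrics named above, here is a minimal pure-Python sketch of ROUGE-1, ROUGE-2, and ROUGE-L F1 scores. It is an illustration only, not the evaluation script used for this model. The function names and the `tokenize` parameter are hypothetical, and whitespace tokenization is an assumption: Khmer script does not normally separate words with spaces, so a Khmer word segmenter or character-level tokens (`tokenize=list`) may be more appropriate in practice.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 of precision (overlap/candidate) and recall (overlap/reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # & keeps the minimum count per n-gram
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


def rouge_scores(candidate_text, reference_text, tokenize=str.split):
    """Score one summary against one reference.

    `tokenize` defaults to whitespace splitting (an assumption; see note above).
    """
    cand, ref = tokenize(candidate_text), tokenize(reference_text)
    return {
        "rouge1": rouge_n(cand, ref, 1),
        "rouge2": rouge_n(cand, ref, 2),
        "rougeL": rouge_l(cand, ref),
    }
```

Calling `rouge_scores(generated_summary, reference_summary)` returns a dictionary with the three F1 values. Published ROUGE implementations differ in stemming, tokenization, and how multi-sentence ROUGE-L is aggregated, so numbers from this sketch are not directly comparable to those reported by other tools.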