vietnamese-correction-lora-v3
This model is a fine-tuned version of vinai/bartpho-syllable on the bmd1905/vi-error-correction-2.0 dataset. It achieves the following results on the evaluation set:
- "epoch": 5.0,
- "eval_f1_score": 0.5853658536585366,
- "eval_loss": 0.12532223761081696,
- "eval_precision": 1.0,
- "eval_recall": 0.41379310344827586,
- "eval_runtime": 18757.1356,
- "eval_sacrebleu": 37.83399298017529,
- "eval_samples_per_second": 6.85,
- "eval_steps_per_second": 1.142,
- "step": 27775
Training results
| Training Loss | Epoch | Step | Validation Loss | Sacrebleu | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|---|
| 0.2162 | 1 | 16666 | 0.3123 | 32.2314 | 1.0000 | 0.4138 | 0.5854 |
| 0.1994 | 2 | 22220 | 0.2335 | 34.4627 | 1.0000 | 0.4138 | 0.5854 |
| 0.1982 | 3 | 16666 | 0.1481 | 37.7962 | 1.0000 | 0.4138 | 0.5854 |
| 0.1272 | 4 | 22220 | 0.1272 | 37.7962 | 1.0000 | 0.4138 | 0.5854 |
| 0.1253 | 5 | 27775 | 0.1253 | 37.8340 | 1.0000 | 0.4138 | 0.5854 |
Training and evaluation data
```
DatasetDict({
    train: Dataset({
        features: ['input', 'output'],
        num_rows: 1_000_000
    })
    val: Dataset({
        features: ['input', 'output'],
        num_rows: 200_000
    })
    test: Dataset({
        features: ['input', 'output'],
        num_rows: 40_000
    })
})
```
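Assuming the dataset is hosted on the Hub under the id above and exposes the same `train`/`val`/`test` splits, it can be pulled with a sketch like this:

```python
from datasets import load_dataset

# Load the correction pairs; each example maps a noisy 'input' sentence
# to its corrected 'output'.
dataset = load_dataset("bmd1905/vi-error-correction-2.0")
print(dataset)  # expected to show the train/val/test splits listed above

sample = dataset["train"][0]
print(sample["input"], "->", sample["output"])
```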
Training procedure
The following bitsandbytes quantization config was used during training:
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: float16
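The same configuration can be rebuilt in code. The sketch below only mirrors the fields listed above; passing it to `from_pretrained` when loading the base model is an assumption about the training script rather than a verbatim excerpt:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and float16 compute,
# matching the bitsandbytes settings listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForSeq2SeqLM.from_pretrained(
    "vinai/bartpho-syllable",
    quantization_config=bnb_config,
    device_map="auto",
)
```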
Training hyperparameters
The following hyperparameters were used during training:
- trainable params: 25,165,824 || all params: 326,801,408 || trainable%: 7.700647360735974

The base model is loaded in 4-bit precision, so training requires adding extra modules (adapters) on top of the frozen base weights with the `peft` library; see the examples at https://github.com/huggingface/peft for details, and the sketch after the list below.

- Num examples = 1,000,000
- Num epochs = 5
- Instantaneous batch size per device = 6
- Total train batch size (w. parallel, distributed & accumulation) = 144
- Gradient accumulation steps = 24
- Total optimization steps = 27,775
- Number of trainable parameters = 25,165,824
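The note above describes the usual QLoRA-style PEFT workflow: freeze the quantized base weights and train only small adapter modules. A minimal sketch follows; the LoRA rank, alpha, dropout, and target modules are illustrative assumptions, since the exact adapter configuration behind the 25,165,824 trainable parameters is not stated in this card.

```python
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

# `base_model` is the 4-bit quantized vinai/bartpho-syllable loaded as in the
# quantization sketch above. This call freezes the base weights and casts a few
# layers for numerically stable k-bit training.
model = prepare_model_for_kbit_training(base_model)

# Hypothetical adapter settings; only the resulting parameter counts are reported above.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the trainable / total parameter breakdown
```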
Framework versions
- PEFT 0.4.0
- Transformers 4.47.0
- Pytorch 2.5.1+cu121
- Datasets 3.3.1
- Tokenizers 0.21.0