|
|
--- |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- google-t5/t5-base |
|
|
pipeline_tag: summarization |
|
|
--- |
|
|
|
|
|
**Model Name:** LoRA Fine-Tuned Model for Dialogue Summarization |
|
|
**Model Type:** Seq2Seq with Low-Rank Adaptation (LoRA) |
|
|
**Base Model:** `google-t5/t5-base`
|
|
|
|
|
## Model Details |
|
|
- **Architecture**: T5-base |
|
|
- **Fine-Tuning Technique**: LoRA (Low-Rank Adaptation)

- **PEFT Method**: Parameter-Efficient Fine-Tuning
|
|
- **Dataset**: SAMSum (`samsum`), a corpus of messenger-style dialogues paired with human-written summaries
|
|
- **Metrics**: Evaluated using ROUGE (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum) |
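
To use the adapter, load the frozen T5 base model and attach the LoRA weights with the `peft` library. This is a minimal loading sketch; the adapter repository id below is a placeholder, and the actual id depends on where this checkpoint is hosted.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model and its tokenizer.
base_model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")

# Attach the LoRA adapter weights on top of the base model.
# NOTE: the repo id below is a placeholder, not a published checkpoint.
model = PeftModel.from_pretrained(base_model, "your-username/t5-base-lora-samsum")
```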
|
|
|
|
|
## Intended Use |
|
|
This model is designed to summarize dialogues, such as conversations between individuals in a chat or messaging context (a usage sketch follows the list below). It is suitable for applications in:
|
|
- **Customer Service**: Summarizing chat logs for quality monitoring or training. |
|
|
- **Messaging Apps**: Generating conversation summaries for user convenience. |
|
|
- **Content Creation**: Assisting writers by summarizing character dialogues. |
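
As a quick illustration, the snippet below summarizes a short chat. It assumes `model` and `tokenizer` were loaded as in the sketch above; whether this adapter expects T5's `"summarize: "` task prefix is an assumption carried over from the base model's convention.

```python
# Hypothetical chat transcript in the SAMSum style.
dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes, 12:30 at the usual place.\n"
    "Anna: Perfect, see you then!"
)

# "summarize: " is T5's standard task prefix (an assumption for this adapter).
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```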
|
|
|
|
|
## Training Process |
|
|
|
|
|
- **Optimizer**: AdamW with a learning rate of 3e-5

- **Batch Size**: 4, with gradient accumulation over 2 steps (effective batch size of 8)

- **Training Epochs**: 2

- **Evaluation Metrics**: ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum

- **Hardware**: A single GPU, using mixed precision to speed up training
|
|
|
|
|
The model was trained with the `Seq2SeqTrainer` class from `transformers`, with LoRA adapters applied to selected attention layers, which cuts the number of trainable parameters substantially while preserving summarization quality; a reconstruction of this setup is sketched below.
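
The following sketch assembles the training loop under the hyperparameters listed above. The LoRA rank, alpha, dropout, and target modules (T5's `q` and `v` attention projections), as well as the maximum sequence lengths and output directory, are assumptions for illustration; the card only states that LoRA was applied to selected attention layers.

```python
import evaluate
from datasets import load_dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

# Wrap the base model with LoRA adapters on the attention projections.
# Rank, alpha, dropout, and target modules are assumptions, not stated in the card.
model = get_peft_model(model, LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q", "v"],
))

dataset = load_dataset("samsum")

def preprocess(batch):
    # Prefix inputs with T5's summarization task prefix (an assumption).
    model_inputs = tokenizer(
        ["summarize: " + d for d in batch["dialogue"]],
        max_length=512, truncation=True,
    )
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Replace the -100 loss-masking padding before decoding the references.
    labels = [
        [tok if tok != -100 else tokenizer.pad_token_id for tok in label]
        for label in labels
    ]
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    return rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-lora-samsum",   # illustrative path
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,      # effective batch size of 8
    num_train_epochs=2,
    fp16=True,                          # mixed precision on a single GPU
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
trainer.evaluate()  # reports ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum
```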