---
license: apache-2.0
base_model: google/flan-t5-small
tags:
- summarization
- meeting-summarization
- text-generation-inference
- transformers
datasets:
- qmsum
language:
- en
---

# Meeting Summarizer

This model is a fine-tuned version of **google/flan-t5-small** for abstractive meeting summarization.

## Model Details

- **Base Model:** google/flan-t5-small
- **Task:** Abstractive meeting summarization
- **Training Data:** QMSum dataset + enhanced training
- **Parameters:** ~60.5M
- **Max Input Length:** 256 tokens
- **Max Output Length:** 64 tokens
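
Because the input limit is 256 tokens, anything beyond that is cut off when truncation is enabled, and most real meeting transcripts are longer. One possible workaround is to split the transcript into pieces that each fit the limit. The sketch below is only an illustration, not part of this model's training or evaluation setup: the helper name `chunk_transcript` is hypothetical, it assumes one speaker turn per line, and it counts whitespace-separated words as a stand-in for real tokens.

```python
from typing import Callable, List


def _count_words(text: str) -> int:
    """Stand-in token counter (assumption): whitespace-separated words."""
    return len(text.split())


def chunk_transcript(
    text: str,
    max_tokens: int = 256,
    count_tokens: Callable[[str], int] = _count_words,
) -> List[str]:
    """Greedily pack speaker turns (one per line) into chunks within max_tokens.

    A single turn longer than max_tokens is kept as its own chunk, so it
    would still be truncated by the tokenizer later.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for turn in text.splitlines():
        turn = turn.strip()
        if not turn:
            continue  # skip blank lines between turns
        n = count_tokens(turn)
        # Close the current chunk if this turn would push it over the budget.
        if current and current_len + n > max_tokens:
            chunks.append("\n".join(current))
            current, current_len = [], 0
        current.append(turn)
        current_len += n
    if current:
        chunks.append("\n".join(current))
    return chunks
```

For counts that match the model more closely, a counter based on the real tokenizer can be passed instead, for example `count_tokens=lambda s: len(tokenizer.encode(s, add_special_tokens=False))`. The per-turn counts are summed, so the total is an approximation of the length of the joined chunk.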

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("CodeXRyu/meeting-summarizer-v2")
model = AutoModelForSeq2SeqLM.from_pretrained("CodeXRyu/meeting-summarizer-v2")

# Example usage
meeting_text = "Your meeting transcript here..."

# Tokenize the transcript; input beyond 256 tokens is truncated
inputs = tokenizer(meeting_text, return_tensors="pt", max_length=256, truncation=True)

# Generate a summary of at most 64 tokens with beam search
outputs = model.generate(**inputs, max_length=64, num_beams=4, early_stopping=True)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
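
For transcripts that have already been split into pieces short enough for the 256-token input limit, the generation call above can be wrapped in a function and applied piece by piece. The following is a minimal sketch under that assumption. The names `make_summarize_fn` and `summarize_chunks` are hypothetical helpers, and joining the partial summaries with a separator is a simple choice that has not been evaluated for this model.

```python
from typing import Callable, Iterable


def make_summarize_fn(tokenizer, model, max_input: int = 256, max_output: int = 64):
    """Wrap an already-loaded tokenizer and model as a str -> str function."""

    def summarize_fn(text: str) -> str:
        inputs = tokenizer(
            text, return_tensors="pt", max_length=max_input, truncation=True
        )
        outputs = model.generate(
            **inputs, max_length=max_output, num_beams=4, early_stopping=True
        )
        return tokenizer.decode(outputs[0], skip_special_tokens=True)

    return summarize_fn


def summarize_chunks(
    chunks: Iterable[str],
    summarize_fn: Callable[[str], str],
    separator: str = " ",
) -> str:
    """Summarize each chunk independently and join the non-empty results."""
    partial = [summarize_fn(chunk).strip() for chunk in chunks]
    return separator.join(p for p in partial if p)
```

With the `tokenizer` and `model` loaded as shown earlier, this could be called as `summarize_chunks(pieces, make_summarize_fn(tokenizer, model))`, where `pieces` is a list of transcript segments. Each piece is summarized without context from the others, so references that span pieces may be lost.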

## Training Configuration

- **Max Input Length:** 256 tokens
- **Max Output Length:** 64 tokens
- **Training:** Fine-tuned on meeting summarization data


---

*This model was trained for meeting summarization tasks.*