MeetingScript
A BigBird-Pegasus model fine-tuned on the MeetingBank dataset to summarize meeting transcripts.
📦 Model Files
- Weights & config: pytorch_model.bin, config.json
- Tokenizer: tokenizer.json, tokenizer_config.json, merges.txt, special_tokens_map.json
- Generation defaults: generation_config.json
🔗 Hub: https://github.com/kevin0437/Meeting_scripts
Model Description
MeetingScript is a sequence-to-sequence model based on google/bigbird-pegasus-large-bigpatent and fine-tuned on the MeetingBank corpus of meeting transcripts paired with human-written summaries. It is designed to take long meeting transcripts (up to 4096 tokens) and produce concise, coherent summaries.
Evaluation Results
Evaluated on the held‑out test split of MeetingBank (≈ 600 transcripts), using beam search (4 beams, max_length=600):
| Metric | F1 Score (%) |
|---|---|
| ROUGE‑1 | 51.5556 |
| ROUGE‑2 | 38.5378 |
| ROUGE‑L | 48.0786 |
| ROUGE‑Lsum | 48.0142 |
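To make the F1 figures above concrete, here is a minimal pure-Python sketch of how a ROUGE-N F1 score is formed from clipped n-gram overlap. It is a simplification for illustration only: it lowercases and splits on whitespace, with no stemming, so it will not reproduce the numbers in the table, and the evaluation script actually used for this model is not shown in the card. The function names `ngrams` and `rouge_n_f1` and the two example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 between two strings, as a fraction in [0, 1]."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up example pair, not taken from MeetingBank.
reference = "the council approved the budget"
candidate = "the council approved a new budget"
score_1 = rouge_n_f1(reference, candidate, n=1)
score_2 = rouge_n_f1(reference, candidate, n=2)
print(f"ROUGE-1 F1: {100 * score_1:.2f}%  ROUGE-2 F1: {100 * score_2:.2f}%")
```

For reportable numbers, an established ROUGE implementation with stemming and sentence splitting is the safer choice; this sketch only shows where precision, recall, and F1 come from.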
Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# 1) Load from the Hub and pick a device (GPU if available, otherwise CPU)
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("Shaelois/MeetingScript")
model = AutoModelForSeq2SeqLM.from_pretrained("Shaelois/MeetingScript").to(device)

# 2) Summarize a long transcript
transcript = """
Alice: Good morning everyone, let’s get started…
Bob: I updated the design mockups…
… (thousands of words) …
"""
# Tokenize, truncating to the 4096-token input limit, and move the
# tensors to the same device as the model
inputs = tokenizer(
    transcript,
    max_length=4096,
    truncation=True,
    return_tensors="pt"
).to(device)

summary_ids = model.generate(
    **inputs,
    num_beams=4,
    max_length=150,
    early_stopping=True
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("📝 Summary:", summary)
Training Data
- Dataset: MeetingBank
- Splits: Train (5000+), Validation (600+), Test (600+)
- Preprocessing: Sliding-window chunking for sequences > 4096 tokens
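The sliding-window preprocessing mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not the preprocessing code used for training: the card gives only the 4096-token window size, so the stride of 3072 (an overlap of 1024 tokens) and the function name `sliding_windows` are hypothetical, and how the resulting chunks were paired with summaries is not stated.

```python
def sliding_windows(token_ids, window=4096, stride=3072):
    """Split a token-id list into overlapping windows of at most `window` tokens.

    `stride` is how far the window start advances each step; the overlap
    between consecutive windows is `window - stride` tokens.
    """
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    if len(token_ids) <= window:
        # Short sequences fit in a single window and are returned unchanged.
        return [list(token_ids)]
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + window]))
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the sequence
        start += stride
    return chunks


# Stand-in for tokenizer output: 10,000 fake token ids.
ids = list(range(10_000))
chunks = sliding_windows(ids)
print([len(c) for c in chunks])
```

In practice the ids would come from the tokenizer (without truncation), and each chunk could then be passed to the model separately. Overlapping windows reduce the chance that a sentence cut at a boundary is lost entirely.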