Polish Text Summarizer

FLAN-T5-base fine-tuned for Polish text summarization.

Model Details

  • Base model: google/flan-t5-base (248M parameters)
  • Task: Text summarization
  • Language: Polish
  • Dataset: allegro/summarization-polish-summaries-corpus

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("PiotrWarzachowski/polish-text-summarizer")
model = AutoModelForSeq2SeqLM.from_pretrained("PiotrWarzachowski/polish-text-summarizer")

article = "Twój długi artykuł po polsku..."  # "Your long article in Polish..."

# Truncate the input to the model's 512-token limit.
inputs = tokenizer(article, max_length=512, truncation=True, return_tensors="pt")
# Beam search with a no-repeat-ngram constraint to reduce repetitive output.
outputs = model.generate(**inputs, max_length=128, num_beams=4, no_repeat_ngram_size=3)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(summary)
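
The same checkpoint also works with the high-level pipeline API. A minimal sketch, with generation arguments mirroring the example above:

from transformers import pipeline

summarizer = pipeline("summarization", model="PiotrWarzachowski/polish-text-summarizer")
result = summarizer(
    article,
    max_length=128,
    num_beams=4,
    no_repeat_ngram_size=3,
    truncation=True,  # truncate inputs to the model's 512-token limit
)
print(result[0]["summary_text"])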

Limitations

  • Max input: 512 tokens (~2000-3000 characters); longer articles must be truncated or chunked (see the sketch after this list)
  • Max output: 128 tokens (~500 characters)
  • Polish diacritics (ą, ę, ł, etc.) may be simplified to ASCII equivalents
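
One workaround for articles longer than 512 tokens is to split the input into chunks, summarize each chunk, and join the partial summaries. A minimal sketch reusing the tokenizer and model from the Usage section; the chunk size and joining strategy are assumptions, not part of the model:

def summarize_long(article, chunk_tokens=480):
    # Tokenize the whole article once; slices below each fit the 512-token limit.
    ids = tokenizer(article, return_tensors="pt").input_ids[0]
    summaries = []
    for start in range(0, len(ids), chunk_tokens):
        chunk = ids[start:start + chunk_tokens].unsqueeze(0)
        out = model.generate(chunk, max_length=128, num_beams=4, no_repeat_ngram_size=3)
        summaries.append(tokenizer.decode(out[0], skip_special_tokens=True))
    # Join per-chunk summaries; for a tighter result, the joined text
    # could be passed through the model once more.
    return " ".join(summaries)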

Training

  • Optimizer: Adafactor
  • Batch size: 1 (with gradient accumulation 8)
  • Epochs: 3
  • Learning rate: 1e-4
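
A minimal sketch of this setup using the Trainer API. The tokenized_train dataset is a hypothetical preprocessed split of the Allegro corpus; this is a reconstruction from the hyperparameters above, not the author's exact training script:

from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

args = Seq2SeqTrainingArguments(
    output_dir="polish-text-summarizer",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8
    num_train_epochs=3,
    learning_rate=1e-4,
    optim="adafactor",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # hypothetical: tokenized allegro/summarization-polish-summaries-corpus
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()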