scientific-summarizer-v1
Starter model repository for scientific summarization experiments.
Status
Fine-tuned checkpoint uploaded: v0.2 (trained on sft_review_queue_500.jsonl, 1 epoch, batch size 2).
A pilot checkpoint from the 10-row run is also available in the repo history.
Base model plan
Recommended first baseline: facebook/bart-large-cnn
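A minimal sketch of running the baseline before any fine-tuning, assuming the `transformers` library with a PyTorch backend is installed. The helper names `load_baseline` and `summarize`, the beam count, and the length limits are illustrative choices, not settings taken from this repository.

```python
def load_baseline(model_name="facebook/bart-large-cnn"):
    """Load the tokenizer and seq2seq model for the baseline checkpoint."""
    # Imported lazily so summarize() can be exercised without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def summarize(text, tokenizer, model, max_input_tokens=1024, max_new_tokens=128):
    """Generate one summary for one input document."""
    # BART accepts at most 1024 input positions, so longer papers are truncated.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # num_beams=4 is an assumed decoding setting for illustration only.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the checkpoint on first call):
# tokenizer, model = load_baseline()
# print(summarize("Full text of a paper ...", tokenizer, model))
```

Keeping the loading step separate from `summarize` makes it straightforward to swap in a fine-tuned checkpoint later by passing a different model name or local path to `load_baseline`.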
Dataset
YanJo199/scientific-papers-sft-v1
- Start with pilot/sft_review_queue_10.jsonl for smoke tests
- Scale to sft_review_queue_500.jsonl after validation
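Before a smoke test, it can help to confirm that a JSONL file parses cleanly and to see which fields its rows carry. The sketch below assumes the file has already been downloaded to a local path. The record schema is not documented here, so the hypothetical helper `inspect_jsonl` only counts rows and tallies top-level keys rather than assuming any field names.

```python
import json
from collections import Counter
from pathlib import Path


def inspect_jsonl(path):
    """Return (row_count, Counter of top-level keys) for a JSON Lines file."""
    row_count = 0
    key_counts = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the offending line so a broken row is easy to find.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
            row_count += 1
            if isinstance(record, dict):
                key_counts.update(record.keys())
    return row_count, key_counts


# Example usage (local path is an assumption):
# rows, keys = inspect_jsonl("pilot/sft_review_queue_10.jsonl")
# print(rows, dict(keys))
```

A key that appears in fewer rows than the total row count points to incomplete records worth checking before scaling up to the larger file.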
Training objective
Supervised fine-tuning for the summary task.
Evaluation
Latest evaluation (February 25, 2026):
- Validation file: data/sft_valid_v6_summary.jsonl
- Samples: 100 (summary task)
- ROUGE-1: 0.4309
- ROUGE-2: 0.3134
- ROUGE-L: 0.3586
- ROUGE-Lsum: 0.3584
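For readers unfamiliar with the metric, ROUGE-1 and ROUGE-2 measure unigram and bigram overlap between a generated summary and its reference. The sketch below is a simplified illustration of the F-measure variant, assuming lowercased whitespace tokenization and no stemming. The scoring tool behind the numbers above is not stated here, so this function will not reproduce them exactly. The name `rouge_n_f1` is hypothetical.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, prediction, n=1):
    """Simplified ROUGE-N F1: clipped n-gram overlap, whitespace tokens."""
    ref_counts = _ngrams(reference.lower().split(), n)
    pred_counts = _ngrams(prediction.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Example usage:
# rouge_n_f1("a b c d", "a b", n=1)  # precision 1.0, recall 0.5
# rouge_n_f1("a b c d", "a b", n=2)  # precision 1.0, recall 1/3
```

A corpus-level score of the kind listed above is typically an aggregate of per-sample scores over the validation set.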
Next release target
Run full validation (all rows) and publish a tuned v0.3 checkpoint with updated training settings.