JA-Base-CP

A Japanese monolingual base model continually pretrained for 15k steps on the target-language (Japanese) split of SciLaD, serving as a control baseline.

Model Details

This is a monolingual continued-pretraining control checkpoint reported in the paper's results table; it is provided so the baseline comparison can be reproduced. The checkpoint (rausch/ja-t5-base-sci-cp-15k) is a T5-base-sized model of roughly 0.2B parameters, stored as F32 Safetensors.
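A minimal loading sketch with Hugging Face Transformers is shown below. The repository id is taken from the model page; the seq2seq (T5-style) auto classes, the example input, and the generation settings are assumptions for illustration, not part of the paper's setup.

```python
# Minimal loading sketch; assumes a T5-style seq2seq checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "rausch/ja-t5-base-sci-cp-15k"  # repository id from the model page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# "What is photosynthesis?" -- illustrative Japanese input only; a base
# continued-pretraining checkpoint is not instruction-tuned.
inputs = tokenizer("光合成とは何ですか？", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```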

Evaluation

Zero-shot Global-MMLU accuracy, as aggregated and reported in the paper:

Metric             Accuracy (%)
Average            22.95
STEM               21.25
Humanities         24.23
Social Sciences    21.71
Other              23.98
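The exact evaluation harness and prompt format are not specified here. The sketch below shows one common way to compute zero-shot multiple-choice accuracy by likelihood scoring of each option; the Global-MMLU dataset id, its column names, and the scoring recipe are all assumptions, not the paper's pipeline.

```python
# Illustrative zero-shot multiple-choice scoring sketch; dataset id, column
# names, and prompt format are assumptions, not the paper's exact setup.
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "rausch/ja-t5-base-sci-cp-15k"  # repository id from the model page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id).eval()

# Assumed Global-MMLU layout: question, option_a..option_d, answer in {A,B,C,D}.
data = load_dataset("CohereForAI/Global-MMLU", "ja", split="test")

def choice_loss(prompt: str, answer: str) -> float:
    """Seq2seq loss of a candidate answer given the question prompt."""
    enc = tokenizer(prompt, return_tensors="pt", truncation=True)
    labels = tokenizer(answer, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(**enc, labels=labels).loss.item()

correct = 0
n = 100  # small subset for illustration
for ex in data.select(range(n)):
    options = [ex["option_a"], ex["option_b"], ex["option_c"], ex["option_d"]]
    pred = min(range(4), key=lambda i: choice_loss(ex["question"], options[i]))
    correct += "ABCD"[pred] == ex["answer"]
print(f"accuracy: {correct / n:.4f}")
```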

Limitations

The model is evaluated primarily with zero-shot Global-MMLU. Downstream task-specific evaluation is recommended before deployment in specialized scientific workflows.

Citation

  • Title: Transferring Scientific English Pre-Trained Language Models to Multiple Languages Using Cross-Lingual Transfer
  • Authors: Nikolas Rauscher, Fabio Barth, Georg Rehm
  • Venue: LREC-COLING 2026, citation details TBA after publication