TRL-Bench pretrained checkpoints
Mirror of upstream pretrained weights for models evaluated in TRL-Bench (paper: arXiv:2606.09323; code: LOGO-CUHKSZ/TRL-Bench). Each subdirectory contains:
- a copy of the upstream LICENSE
- a NOTICE file with citation, provenance, and SHA256
- the binary checkpoint(s)
The checkpoints have not been modified; they are byte-identical to the upstream releases. This repo exists only to provide a reliable, scriptable download path that does not depend on third-party file-share services (Google Drive, SharePoint), which can rate-limit downloads or expire links.
Coverage
| Model | License (upstream) | Citation | Note |
|---|---|---|---|
| TUTA | MIT (Microsoft) | Wang et al., KDD 2021 | |
| TURL | Apache-2.0 (sunlab-osu) | Deng et al., PVLDB 14(3), 2020 | |
| TaBERT | CC BY-NC 4.0 (Facebook AI Research) | Yin et al., ACL 2020 | Non-commercial only — see tabert/NOTICE |
TaBERT is mirrored under the upstream CC BY-NC 4.0 terms strictly for
non-commercial reproduction of the paper experiments. If you intend a
commercial application you may not use that checkpoint; train your own or
seek an alternate license from the original authors (see tabert/NOTICE).
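
For scripted downloads, the non-commercial restriction can be applied when choosing which subdirectories to fetch. The following is a minimal sketch, not part of TRL-Bench: the `LICENSES` mapping restates the Coverage table, while `NON_COMMERCIAL` and `patterns_for` are hypothetical names introduced here for illustration.

```python
# Upstream license per mirrored model, restated from the Coverage table.
LICENSES = {
    "tuta": "MIT",
    "turl": "Apache-2.0",
    "tabert": "CC BY-NC 4.0",
}

# Licenses that forbid commercial use (hypothetical helper set).
NON_COMMERCIAL = {"CC BY-NC 4.0"}


def patterns_for(commercial: bool) -> list[str]:
    """Return glob patterns for the subdirectories usable in the given setting."""
    return [
        f"{model}/*"
        for model, license_name in LICENSES.items()
        if not (commercial and license_name in NON_COMMERCIAL)
    ]


print(patterns_for(commercial=True))   # TaBERT is excluded
print(patterns_for(commercial=False))  # all three models
```

The returned list has the shape that `snapshot_download` accepts for its `allow_patterns` argument. This only automates the selection; the per-model LICENSE and NOTICE files remain the authority on what is permitted.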
Other TRL-Bench upstream models — TabSketchFM (CC BY-NC-ND 4.0, upstream
only), Starmie (user-trained), and TABBIE (obtain from upstream) — are
not mirrored here. See docs/CHECKPOINT_LICENSES.md and
scripts/download_checkpoints.sh in the
TRL-Bench repo for the upstream
paths used for those models.
Downloading
Programmatic (huggingface_hub):
```python
from huggingface_hub import snapshot_download

snapshot_download(
    "logo-lab/trl-bench-ckpts",
    allow_patterns=["tuta/*", "turl/*", "tabert/*"],
    local_dir="./checkpoints",
)
```
Or via TRL-Bench's bundled script (verifies SHA256 against
scripts/checksums.sha256 after download):
```shell
git clone https://github.com/LOGO-CUHKSZ/TRL-Bench.git
cd TRL-Bench
bash scripts/download_checkpoints.sh
```
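
If the files were fetched with `huggingface_hub` instead of the bundled script, the same SHA256 check can be done by hand. The sketch below is an illustration, not the script's actual logic. It assumes `scripts/checksums.sha256` uses the common `sha256sum` line format (`<hex digest>  <relative path>`), which this README does not confirm. `sha256_of` and `verify` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a multi-GB checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(checksum_file: Path, root: Path) -> list[str]:
    """Return the relative paths whose file is missing or whose digest differs.

    Assumption: each line looks like "<hex digest>  <relative path>".
    """
    failures = []
    for line in checksum_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        expected, _, relative = line.partition(" ")
        # sha256sum prefixes the path with "*" for files hashed in binary mode.
        relative = relative.strip().lstrip("*")
        target = root / relative
        if not target.is_file() or sha256_of(target) != expected.lower():
            failures.append(relative)
    return failures
```

A call such as `verify(Path("scripts/checksums.sha256"), Path("./checkpoints"))` returns an empty list when every listed file matches. Which directory the paths in the checksum file are relative to is also an assumption, so adjust `root` accordingly.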
Layout
```
tuta/
  LICENSE               # MIT (upstream Microsoft)
  NOTICE                # citation, provenance, SHA256
  tuta.bin              # 511 MB
turl/
  LICENSE               # Apache-2.0 (upstream sunlab-osu)
  NOTICE                # citation, provenance, SHA256
  pretrained/
    config.json
    pytorch_model.bin   # 1.2 GB
tabert/
  LICENSE               # CC BY-NC 4.0 (upstream Facebook AI Research)
  NOTICE                # citation, provenance, SHA256 + NC terms
  tabert_base_k3/
    model.bin           # 266 MB
    tb_config.json
    version.txt
```
These paths match the expectations of TRL-Bench's scripts/checksums.sha256
and the per-model wrappers in src/trl_bench/models/{tuta,turl,tabert}/.
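
Before running the wrappers, a quick check that a local download has this layout can save a confusing load error. The sketch below is a hypothetical helper, not part of TRL-Bench. It assumes that `config.json` and `pytorch_model.bin` sit under `turl/pretrained/` and that the three TaBERT files sit under `tabert/tabert_base_k3/`, as the listing above suggests.

```python
from pathlib import Path

# Checkpoint files restated from the Layout listing (nesting is an assumption).
EXPECTED_FILES = [
    "tuta/tuta.bin",
    "turl/pretrained/config.json",
    "turl/pretrained/pytorch_model.bin",
    "tabert/tabert_base_k3/model.bin",
    "tabert/tabert_base_k3/tb_config.json",
    "tabert/tabert_base_k3/version.txt",
]


def missing_files(root: Path) -> list[str]:
    """Return the expected checkpoint files that are absent under root."""
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

For example, `missing_files(Path("./checkpoints"))` returns an empty list when all six files are present. It checks only that the files exist, not their contents.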
Attribution
This is a third-party mirror. The upstream authors are the sole copyright
holders of the model weights. The per-model LICENSE and NOTICE files in
each subdirectory are authoritative.
If you use these checkpoints, cite the upstream papers (see each subdir's
NOTICE). If you use TRL-Bench, also cite:
```bibtex
@article{pang2026trl,
  title={TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders},
  author={Pang, Wei and Jian, Xiangru and Li, Hehan and Yu, Zhixuan and Xue, Alex and Li, Jinyang and Dong, Zhengyuan and Zhao, Xinjian and Xu, Hao and Zhang, Chao and Cheng, Reynold and {\"O}zsu, M. Tamer and Yu, Tianshu},
  journal={arXiv preprint arXiv:2606.09323},
  year={2026}
}
```