CleanTS Model Card

Introduction

CleanTS is a competitive pre-trained baseline for time-series forecasting. It is designed to demonstrate the potential of data-centric optimization using a standard encoder-only Transformer architecture without any structural modifications.

Key Technical Principles

1. Pure Architecture

CleanTS makes no modifications to the attention mechanism or model structure. It is intended to show that high-quality data governance can significantly improve the performance of vanilla architectures.

2. Systematic Data Governance

CleanTS adheres to a strict data-centric pipeline:

  • Zero Synthetic Data: All training is performed exclusively on authentic, real-world data.
  • Publicly Sourced: The training corpus consists entirely of publicly available datasets, ensuring transparency and accessibility.
  • Advanced Cleaning: We achieve promising results solely through systematic data cleaning and preprocessing strategies.
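
The actual cleaning methodology has not been published yet, so the following is only an illustrative sketch of what a data-centric cleaning step for a single time series might look like. The function name `clean_series`, its thresholds, and the specific rules (sparsity filter, constant-series filter, median-absolute-deviation clipping, forward fill) are assumptions made for this example and are not taken from the CleanTS pipeline.

```python
import math
from statistics import median


def clean_series(values, max_missing_ratio=0.2, mad_threshold=5.0):
    """Hypothetical cleaning step (NOT the CleanTS pipeline).

    Returns a cleaned list of floats, or None if the series is rejected.
    """
    n = len(values)
    if n == 0:
        return None

    # Treat None and NaN as missing observations.
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    observed = [v for v in values if not is_missing(v)]

    # Reject series that are too sparse to be useful.
    if len(observed) < 2 or (n - len(observed)) / n > max_missing_ratio:
        return None

    # Reject constant series, which carry no forecasting signal.
    if min(observed) == max(observed):
        return None

    # Clip outliers to median +/- mad_threshold * MAD.
    med = median(observed)
    mad = median([abs(v - med) for v in observed])
    if mad > 0:
        lower = med - mad_threshold * mad
        upper = med + mad_threshold * mad
    else:
        # MAD of zero: skip clipping rather than collapse the series.
        lower, upper = min(observed), max(observed)

    def clip(v):
        return float(min(max(v, lower), upper))

    # Forward-fill missing values; leading gaps use the first observed value.
    cleaned = []
    last = clip(observed[0])
    for v in values:
        last = last if is_missing(v) else clip(v)
        cleaned.append(last)
    return cleaned
```

In this sketch a series is either repaired or dropped entirely. No values are generated beyond simple forward fill, which is in keeping with the zero-synthetic-data principle stated above.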

3. Contamination Prevention

We strictly guarantee that no data from the GiftEval test set, Fev-bench, Fev-leaderboard, or LSTF was involved in the training phase.
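
How this guarantee is enforced is not described here. As a rough sketch of one possible check, the code below flags a training series when its normalized name or an exact content hash collides with a held-out benchmark series. The helper names (`normalize_name`, `fingerprint`, `find_contamination`) and the dictionary-based data layout are hypothetical, chosen for illustration only.

```python
import hashlib
import re


def normalize_name(name):
    """Lowercase and drop non-alphanumerics so 'A-b_C' and 'abc' compare equal."""
    return re.sub(r"[^a-z0-9]+", "", name.lower())


def fingerprint(values, decimals=6):
    """Hash a series after rounding, to catch exact copies under another name."""
    payload = ",".join(f"{v:.{decimals}f}" for v in values)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_contamination(train, held_out):
    """Return sorted names of training series that collide with held-out data.

    Both arguments map a series name to a list of numeric values
    (an assumed layout, not the format used by any specific benchmark).
    """
    held_names = {normalize_name(name) for name in held_out}
    held_prints = {fingerprint(values) for values in held_out.values()}

    flagged = []
    for name, values in train.items():
        same_name = normalize_name(name) in held_names
        same_content = fingerprint(values) in held_prints
        if same_name or same_content:
            flagged.append(name)
    return sorted(flagged)
```

An exact-hash check like this only detects identical series. Overlapping windows, resampled copies, or rescaled versions of a benchmark series would need a stricter comparison.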

A detailed technical report, including our specific data cleaning methodologies, training configurations, and comprehensive ablation studies, will be released concurrently with the upcoming publication of the full model.

Format: Safetensors
Model size: 65.4M params
Tensor type: F32
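
As a quick sanity check on these figures, the size of the weights can be estimated from the parameter count and tensor type. The helper `weight_bytes` below is a hypothetical name used for this example. Because 65.4M is a rounded count, the result is approximate, and it covers the weights only, not activations or file metadata.

```python
def weight_bytes(num_params, bytes_per_param=4):
    """Approximate weight storage; F32 uses 4 bytes per parameter."""
    return int(num_params * bytes_per_param)


size = weight_bytes(65_400_000)
print(f"{size / 1e6:.1f} MB ({size / 2**20:.1f} MiB)")
```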