CleanTS-65M / README.md
metadata
tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin

CleanTS Model Card

Introduction

CleanTS is a competitive pre-trained baseline for time-series forecasting. It is designed to demonstrate the potential of data-centric optimization using a standard encoder-only Transformer architecture without any structural modifications.

Key Technical Principles

1. Pure Architecture

CleanTS makes no modifications to the attention mechanism or the model structure. It aims to show that high-quality data governance alone can significantly improve the performance of a vanilla architecture.

2. Systematic Data Governance

CleanTS adheres to a strict data-centric pipeline:

  • Zero Synthetic Data: All training is performed exclusively on authentic, real-world data.
  • Publicly Sourced: The training corpus consists entirely of publicly available datasets, ensuring transparency and accessibility.
  • Advanced Cleaning: We achieve promising results solely through systematic data cleaning and preprocessing strategies.
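The specific cleaning rules used for CleanTS are not described in this card. Purely as an illustration of what rule-based filtering of a time-series corpus can look like, here is a minimal sketch. The function names, the thresholds, and the toy corpus are all hypothetical and are not taken from the CleanTS pipeline.

```python
import math


def is_missing(x):
    """Treat None and NaN as missing observations."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def keep_series(series, min_length=4, max_missing_ratio=0.25):
    """Return True if a series passes three simple quality filters.

    The thresholds are illustrative assumptions, not CleanTS settings.
    """
    if len(series) < min_length:
        return False  # too short to be useful
    observed = [x for x in series if not is_missing(x)]
    if 1 - len(observed) / len(series) > max_missing_ratio:
        return False  # too many gaps
    if len(set(observed)) <= 1:
        return False  # a constant (or empty) signal carries no information
    return True


# Toy corpus with hypothetical series names.
corpus = {
    "ok": [1.0, 2.0, 3.0, 2.5],
    "too_short": [1.0, 2.0],
    "mostly_missing": [1.0, None, float("nan"), None],
    "flat": [5.0, 5.0, 5.0, 5.0],
}
cleaned = {name: s for name, s in corpus.items() if keep_series(s)}
```

In this toy example only the `ok` series survives. A real pipeline would likely add further steps such as outlier handling and deduplication.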

3. Contamination Prevention

We strictly guarantee that no data from the GiftEval test set, Fev-bench, Fev-leaderboard, or LSTF was used during training.
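How this guarantee is enforced is not detailed here. As one possible first-line check, a name-level comparison between training sources and held-out benchmark sources could be sketched as follows. The dataset identifiers and helper names are hypothetical, and this is an assumption about how such a check might be done, not the procedure used for CleanTS.

```python
def normalize(name: str) -> str:
    """Lowercase a dataset name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def find_overlap(train_names, heldout_names):
    """Return training dataset names whose normalized form matches a held-out name."""
    heldout = {normalize(n) for n in heldout_names}
    return sorted(n for n in train_names if normalize(n) in heldout)


# Hypothetical dataset identifiers, for illustration only.
train_sources = ["public_weather_hourly", "Retail-Sales", "traffic_sensors"]
heldout_sources = ["retail_sales", "benchmark_energy"]

overlap = find_overlap(train_sources, heldout_sources)
if overlap:
    print("Possible contamination:", overlap)
```

Matching on names only catches renamed or re-cased copies of the same source. Detecting the same series published under a different name would additionally require comparing the values themselves.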

A detailed technical report, including our specific data cleaning methodologies, training configurations, and comprehensive ablation studies, will be released concurrently with the upcoming publication of the full model.