tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin

CleanTS Model Card

Introduction

CleanTS is a competitive pre-trained baseline for time-series forecasting. It is designed to demonstrate the potential of data-centric optimization using a standard encoder-only Transformer architecture without any structural modifications.

Key Technical Principles

1. Pure Architecture

CleanTS makes no modifications to the attention mechanism or the model structure. It is intended to show that high-quality data governance alone can significantly improve the performance of a vanilla architecture.

2. Systematic Data Governance

CleanTS adheres to a strict data-centric pipeline:

Zero Synthetic Data: All training is performed exclusively on authentic, real-world data.
Publicly Sourced: The training corpus consists entirely of publicly available datasets, ensuring transparency and accessibility.
Advanced Cleaning: We achieve promising results solely through systematic data cleaning and preprocessing strategies.
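
The actual cleaning and preprocessing strategies are not described in this card, so the following is only a minimal sketch of what a provenance gate plus a simple quality filter might look like. The metadata fields (`synthetic`, `public`), the function names, and the 20% missing-value threshold are all hypothetical and chosen purely for illustration; they are not the CleanTS pipeline.

```python
import math

def is_admissible(meta):
    """Provenance gate: keep only real-world, publicly available sources.

    `synthetic` and `public` are hypothetical metadata fields.
    """
    return (not meta["synthetic"]) and meta["public"]

def passes_quality_checks(values, max_missing_ratio=0.2):
    """Reject series that are empty, too sparse, or constant.

    The threshold is an illustrative assumption, not a published setting.
    """
    if not values:
        return False
    observed = [v for v in values if v is not None and not math.isnan(v)]
    missing_ratio = 1 - len(observed) / len(values)
    if missing_ratio > max_missing_ratio:
        return False
    # A constant series carries no signal to forecast.
    return len(set(observed)) > 1

nan = float("nan")
corpus = [
    {"name": "a", "synthetic": False, "public": True, "values": [1.0, 2.0, 3.0, 4.0]},
    {"name": "b", "synthetic": True, "public": True, "values": [1.0, 2.0, 3.0, 4.0]},
    {"name": "c", "synthetic": False, "public": True, "values": [5.0, 5.0, 5.0, 5.0]},
    {"name": "d", "synthetic": False, "public": True, "values": [1.0, nan, nan, 2.0]},
]

selected = [
    s["name"]
    for s in corpus
    if is_admissible(s) and passes_quality_checks(s["values"])
]
print(selected)
```

In this toy corpus only `a` survives: `b` is synthetic, `c` is constant, and `d` has half of its values missing. The sketch filters rather than imputes, so no generated values enter the retained data.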

3. Contamination Prevention

We guarantee that no data from the GiftEval test set, Fev-bench, Fev-leaderboard, or LSTF was used during training.
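
One way such a guarantee could be enforced in practice is to tag every candidate dataset with the benchmarks it is known to appear in and reject any overlap before training. The sketch below assumes this tagging exists; the `benchmarks` field, the dataset names, and the helper functions are hypothetical and are not taken from the CleanTS codebase.

```python
# Benchmark names come from the card; the normalized identifiers are an assumption.
EXCLUDED_BENCHMARKS = {"gifteval", "fev-bench", "fev-leaderboard", "lstf"}

def normalize(name):
    """Lower-case and unify separators so 'Fev_Bench' matches 'fev-bench'."""
    return name.strip().lower().replace("_", "-").replace(" ", "-")

def filter_training_sources(sources):
    """Split candidate sources into (kept, rejected) names.

    Each source is a dict with a 'name' and an optional 'benchmarks' list
    naming the evaluation suites the dataset is known to appear in.
    """
    kept, rejected = [], []
    for src in sources:
        tags = {normalize(b) for b in src.get("benchmarks", [])}
        if tags & EXCLUDED_BENCHMARKS:
            rejected.append(src["name"])
        else:
            kept.append(src["name"])
    return kept, rejected

# Hypothetical candidate datasets, for illustration only.
candidates = [
    {"name": "public_energy_load", "benchmarks": []},
    {"name": "retail_sales_daily", "benchmarks": ["Fev_Bench"]},
    {"name": "traffic_hourly", "benchmarks": ["LSTF"]},
]

kept, rejected = filter_training_sources(candidates)
print("kept:", kept)
print("rejected:", rejected)
```

A name-based check like this only catches datasets that are correctly tagged. A stricter audit would also compare the series values themselves, for example by hashing windows, which is beyond this sketch.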

A detailed technical report, including our specific data cleaning methodologies, training configurations, and comprehensive ablation studies, will be released concurrently with the upcoming publication of the full model.