| --- |
| tags: |
| - model_hub_mixin |
| - pytorch_model_hub_mixin |
| --- |
| |
| # CleanTS Model Card |
|
|
| ## Introduction |
|
|
| CleanTS is a competitive pre-trained baseline for time-series forecasting. It is designed to demonstrate the potential of **data-centric optimization** using a **standard encoder-only Transformer** architecture without any structural modifications. |
|
|
| ## Key Technical Principles |
|
|
| ### 1. Pure Architecture |
|
|
| CleanTS makes no modifications to the attention mechanism or the model structure. It is intended to show that high-quality data governance can significantly improve the performance of vanilla architectures. |
|
|
| ### 2. Systematic Data Governance |
|
|
| CleanTS adheres to a strict data-centric pipeline: |
|
|
| - **Zero Synthetic Data:** Training uses only **authentic, real-world data**. |
| - **Publicly Sourced:** The training corpus consists entirely of **publicly available datasets**, ensuring transparency and accessibility. |
| - **Advanced Cleaning:** We achieve promising results solely through systematic data cleaning and preprocessing strategies. |
|
|
| ### 3. Contamination Prevention |
|
|
| We strictly guarantee that **no data from the GiftEval test set, Fev-bench, Fev-leaderboard, or LSTF** was used during training. |
|
|
| > [!NOTE] |
| > A **detailed technical report**, covering our data cleaning methodology, training configurations, and ablation studies, will be released together with the full model. |