Upload README.md
README.md
CHANGED
@@ -4,7 +4,28 @@ tags:
- pytorch_model_hub_mixin
---

# CleanTS Model Card

## Introduction

CleanTS is a competitive pre-trained baseline for time-series forecasting. It is designed to demonstrate the potential of **data-centric optimization** using a **standard encoder-only Transformer** architecture without any structural modifications.

## Key Technical Principles

### 1. Pure Architecture

CleanTS makes no modifications to the attention mechanism or the model structure. It aims to show that high-quality data governance can significantly improve the performance of vanilla architectures.

### 2. Systematic Data Governance

CleanTS adheres to a strict data-centric pipeline:

- **Zero Synthetic Data:** All training is performed exclusively on **authentic, real-world data**.
- **Publicly Sourced:** The training corpus consists entirely of **publicly available datasets**, ensuring transparency and accessibility.
- **Advanced Cleaning:** We achieve promising results solely through systematic data cleaning and preprocessing strategies.
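
The cleaning steps themselves are not described in this card, so the following is only a hypothetical illustration of what a single pass in a data-centric pipeline might look like. It is not the CleanTS method. The function name `clean_series`, the thresholds `max_missing_ratio` and `max_gap`, and the rules chosen here (reject series that are mostly missing or constant, linearly interpolate short interior gaps) are all assumptions made for the example.

```python
import math
from typing import List, Optional


def clean_series(
    values: List[float],
    max_missing_ratio: float = 0.2,  # hypothetical threshold, not from the card
    max_gap: int = 3,                # hypothetical threshold, not from the card
) -> Optional[List[float]]:
    """Return a cleaned copy of `values`, or None if the series is rejected.

    Missing observations are represented as NaN.
    """
    n = len(values)
    if n == 0:
        return None

    missing = [math.isnan(v) for v in values]
    if sum(missing) / n > max_missing_ratio:
        return None  # too sparse to be useful

    observed = [v for v, m in zip(values, missing) if not m]
    if not observed or max(observed) == min(observed):
        return None  # constant series carry no forecasting signal

    cleaned = list(values)
    i = 0
    while i < n:
        if not missing[i]:
            i += 1
            continue
        # Find the end of the current run of missing values: the gap is [i, j).
        j = i
        while j < n and missing[j]:
            j += 1
        gap_len = j - i
        if i == 0 or j == n or gap_len > max_gap:
            return None  # leading, trailing, or overly long gap: reject
        # Linearly interpolate between the observed neighbours of the gap.
        left, right = cleaned[i - 1], cleaned[j]
        for k in range(i, j):
            frac = (k - i + 1) / (gap_len + 1)
            cleaned[k] = left + frac * (right - left)
        i = j
    return cleaned
```

Under these assumed rules, `clean_series([1.0, float("nan"), 3.0, 4.0, 5.0, 6.0])` fills the single gap and returns `[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]`, while a constant series such as `[2.0] * 5` is rejected with `None`.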

### 3. Contamination Prevention

We strictly guarantee that **no data from the GiftEval test set, Fev-bench, Fev-leaderboard, and LSTF** was used during training.
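
One way a guarantee like this could be enforced is to filter the training corpus against a blocklist of benchmark dataset identifiers before training starts. The sketch below is an assumption about how that might look, not the authors' procedure. The helpers `normalize_name` and `filter_contaminated` and every dataset name in the usage example are hypothetical placeholders.

```python
import re
from typing import Dict, Iterable, List, Tuple


def normalize_name(name: str) -> str:
    """Lowercase and drop non-alphanumerics so 'Some-Set_1' and 'some_set-1' match."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def filter_contaminated(
    training_sets: Iterable[str],
    benchmark_sets: Dict[str, Iterable[str]],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split training dataset names into (kept, dropped).

    `dropped` holds (training_name, benchmark_name) pairs so that every
    exclusion can be audited afterwards.
    """
    # Map each normalized benchmark dataset name to the benchmark it came from.
    blocked: Dict[str, str] = {}
    for benchmark, names in benchmark_sets.items():
        for dataset_name in names:
            blocked.setdefault(normalize_name(dataset_name), benchmark)

    kept: List[str] = []
    dropped: List[Tuple[str, str]] = []
    for name in training_sets:
        hit = blocked.get(normalize_name(name))
        if hit is None:
            kept.append(name)
        else:
            dropped.append((name, hit))
    return kept, dropped


# Usage with placeholder names (not real benchmark contents).
benchmarks = {"benchmark_a": ["Example-Set_1"], "benchmark_b": ["other set"]}
kept, dropped = filter_contaminated(
    ["example_set-1", "unrelated_public_data", "Other Set"], benchmarks
)
```

Name matching of this kind only catches whole-dataset overlap. Catching individual series that reappear under a different dataset name would need a content-level comparison on top of it.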

> [!NOTE] A **detailed technical report**, including our specific data cleaning methodologies, training configurations, and comprehensive ablation studies, will be released concurrently with the upcoming publication of the full model.