Ill-Ness committed on
Commit
3455ee2
·
verified ·
1 Parent(s): 5cca1e2

Create README.md

Files changed (1)
  1. README.md +97 -0
README.md ADDED
@@ -0,0 +1,97 @@
1
+ # Chorous1
2
+
3
+ <p align="center">
4
+ <img src="https://img.shields.io/badge/Parameters-100M%20%7C%2050M%20%7C%2027M-brightgreen?style=flat-square" />
5
+ <img src="https://img.shields.io/badge/Architecture-Patch--Transformer-purple?style=flat-square" />
6
+ <img src="https://img.shields.io/badge/License-8f--ai--license--v1.0-red?style=flat-square" />
7
+ </p>
8
+
9
+ > **Chorous1** is a suite of three patch-based transformer models for multivariate time-series forecasting. It combines RevIN, MAE-style patch masking, and a flatten head, and targets state-of-the-art accuracy on real-world benchmark data.
10
+
11
+ ---
12
+
13
+ ## Table of Contents
14
+
15
+ - [Model Variants](#model-variants)
16
+ - [Architecture](#architecture)
17
+ - [Quickstart](#quickstart)
18
+ - [Performance](#performance)
19
+ - [Limitations](#limitations)
20
+ - [License](#license)
21
+
22
+ ---
23
+
24
+ ## Model Variants
25
+
26
+ | Variant | Parameters | Hidden Size | Layers | Query Heads / KV Heads |
27
+ |---|---|---|---|---|
28
+ | `chorous1-100m` | ~100M | 768 | 12 | 12 / 4 |
29
+ | `chorous1-50m` | ~50M | 512 | 16 | 8 / 2 |
30
+ | `chorous1-27m` | ~27M | 384 | 16 | 6 / 2 |
31
+
32
+ ---
33
+
34
+ ## Architecture
35
+
36
+ | Component | Specification |
37
+ |---|---|
38
+ | Context Length | 512 steps |
39
+ | Forecast Horizon | 96 steps |
40
+ | Patch Size | 16 (non-overlapping) |
41
+ | Number of Patches | 32 |
42
+ | FFN Multiplier | 2.667× |
43
+ | Activation | SwiGLU |
44
+ | Positional Encoding | RoPE (θ = 500,000) |
45
+ | Normalization | RMSNorm |
46
+ | Masking Ratio | 25% (training only) |
47
+ | Loss Function | Huber Loss + MAE |
48
+ | Precision | bfloat16 |
49
+
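The patch arithmetic in the table can be checked directly: a 512-step context split into non-overlapping patches of 16 steps gives 512 / 16 = 32 patches per channel. The sketch below shows one way to perform that split in PyTorch. `patchify` is a hypothetical helper name used for illustration only; the released model may patch its input differently.

```python
import torch


def patchify(x: torch.Tensor, patch_size: int = 16) -> torch.Tensor:
    """Split [batch, channels, time] into [batch, channels, n_patches, patch_size].

    Hypothetical helper for illustration; not taken from the Chorous1 code.
    """
    batch, channels, time = x.shape
    if time % patch_size != 0:
        raise ValueError(f"time={time} is not a multiple of patch_size={patch_size}")
    # Patches do not overlap, so a plain reshape is enough.
    return x.reshape(batch, channels, time // patch_size, patch_size)


x = torch.randn(1, 7, 512)
patches = patchify(x)
print(patches.shape)  # torch.Size([1, 7, 32, 16])
```

Each patch then becomes one token for the encoder, so the attention sequence length is 32 rather than 512.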
50
+ ### How It Works
51
+
52
+ **Stage 1 — Neural Encoding.** The transformer encoder processes patches of the time series, using rotary position embeddings (RoPE) and grouped-query attention (GQA) to capture long-range temporal dependencies and periodic structure.
53
+
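In grouped-query attention (GQA), several query heads share one key/value head, which is what the "Query Heads / KV Heads" column in the variants table describes. The following is a minimal sketch of that sharing, not the Chorous1 implementation: it omits RoPE and the learned projections, assumes non-causal attention over the 32 patch tokens, and assumes a head dimension of 768 / 12 = 64 for the 100M variant. The function name is hypothetical.

```python
import math

import torch


def grouped_query_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """q: [batch, n_q_heads, seq, head_dim]; k, v: [batch, n_kv_heads, seq, head_dim]."""
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("number of query heads must be a multiple of KV heads")
    group_size = n_q_heads // n_kv_heads
    # Each KV head is shared by `group_size` consecutive query heads.
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    return torch.softmax(scores, dim=-1) @ v


# Shapes assumed for the 100M row: 12 query heads, 4 KV heads, 32 patch tokens.
q = torch.randn(1, 12, 32, 64)
k = torch.randn(1, 4, 32, 64)
v = torch.randn(1, 4, 32, 64)
out = grouped_query_attention(q, k, v)
print(out.shape)  # torch.Size([1, 12, 32, 64])
```

With 12 query heads and 4 KV heads, the key and value tensors are a third of the size they would be under standard multi-head attention.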
54
+ **Stage 2 — RevIN Normalization.** A reversible instance normalization (RevIN) layer wraps the encoder: it removes each input's mean and variance before encoding, then restores them on the output. This mitigates the distribution shift between training and deployment data that is common in real-world use.
55
+
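The normalize-then-restore idea behind RevIN can be sketched in a few lines. This is a simplified illustration, assuming per-sample, per-channel statistics over the time axis and no learnable affine parameters; the RevIN layer in Chorous1 appears to hold per-channel parameters (see Limitations), which this sketch leaves out. The class below is hypothetical and not the released code.

```python
import torch


class RevIN(torch.nn.Module):
    """Simplified reversible instance normalization (no learnable affine terms)."""

    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps

    def normalize(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, channels, time]; statistics are per sample and per channel.
        self.mean = x.mean(dim=-1, keepdim=True).detach()
        var = x.var(dim=-1, keepdim=True, unbiased=False)
        self.std = torch.sqrt(var + self.eps).detach()
        return (x - self.mean) / self.std

    def denormalize(self, y: torch.Tensor) -> torch.Tensor:
        # y: [batch, channels, horizon]; restore the input's offset and scale.
        return y * self.std + self.mean


revin = RevIN()
x = 50.0 + 10.0 * torch.randn(1, 7, 512)
x_norm = revin.normalize(x)
# Stand-in for the encoder and head: reuse the last 96 normalized steps.
forecast_norm = x_norm[..., -96:]
forecast = revin.denormalize(forecast_norm)
print(forecast.shape)  # torch.Size([1, 7, 96])
```

Because the stored statistics come from the input window itself, the forecast is returned in the original units of each channel.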
56
+ ---
57
+
58
+ ## Quickstart
59
+
60
+ ```python
61
+ import torch
62
+ from safetensors.torch import load_file
63
+
64
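+ # `model` must already be an instance of the matching Chorous1 architecture;
+ # its class is not included in this snippet.
+ # Replace "100m" with "50m" or "27m" as needed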
65
+ weights = load_file("./chorous_checkpoint/100m/model.safetensors")
66
+ model.load_state_dict(weights)
67
+ model.eval()
68
+
69
+ # Input shape: [Batch, Channels, Time]
70
+ x = torch.randn(1, 7, 512)
71
+
72
+ with torch.no_grad():
73
+     forecast = model(x)  # Output shape: [1, 7, 96]
74
+ ```
75
+
76
+ ---
77
+
78
+ ## Performance
79
+
80
+ | Metric | `chorous1-100m` | `chorous1-50m` | `chorous1-27m` |
81
+ |---|---|---|---|
82
+ | Weights Size | ~200 MB | ~110 MB | ~65 MB |
83
+ | VRAM (Inference) | ~12 GB | ~8 GB | ~6 GB |
84
+
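The weight sizes are close to what bfloat16 storage implies, namely two bytes per parameter. The quick check below uses the nominal parameter counts from the variants table; those counts are approximate, so the estimate will not match the listed file sizes exactly. The function name is hypothetical.

```python
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each parameter in 16 bits

# Nominal counts from the Model Variants table (approximate by design).
nominal_params = {
    "chorous1-100m": 100e6,
    "chorous1-50m": 50e6,
    "chorous1-27m": 27e6,
}


def bf16_weight_megabytes(n_params: float) -> float:
    """Estimate checkpoint size in MB (10^6 bytes) for bfloat16 weights."""
    return n_params * BYTES_PER_PARAM_BF16 / 1e6


for name, n_params in nominal_params.items():
    print(f"{name}: ~{bf16_weight_megabytes(n_params):.0f} MB")
```

Inference VRAM is not covered by this estimate, since it also depends on activations, batch size, and framework overhead.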
85
+ ---
86
+
87
+ ## Limitations
88
+
89
+ - **Fixed Forecast Horizon** — Optimized for 96-step forecasting. Modifying the output head for longer horizons may reduce accuracy.
90
+ - **Channel Count Constraint** — The RevIN layer is initialized using the maximum channel count from the training suite. Inputs exceeding this limit are not supported out of the box.
91
+ - **Patch Alignment Requirement** — Input context length must be an exact multiple of the patch size (16).
92
+
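The patch alignment requirement can be handled before calling the model. The sketch below trims a series to the largest multiple of the patch size by dropping the oldest steps, so the most recent observations are kept. `trim_to_patch_multiple` is a hypothetical helper, and whether the model accepts context lengths other than 512 is not stated here, so treat this as a pre-processing idea rather than a supported workflow.

```python
from typing import List, Sequence

PATCH_SIZE = 16


def trim_to_patch_multiple(series: Sequence[float], patch_size: int = PATCH_SIZE) -> List[float]:
    """Drop the oldest steps so the length is an exact multiple of patch_size."""
    usable = (len(series) // patch_size) * patch_size
    if usable == 0:
        raise ValueError(f"need at least {patch_size} steps, got {len(series)}")
    # Keep the most recent `usable` steps.
    return list(series[len(series) - usable:])


history = [float(t) for t in range(530)]
trimmed = trim_to_patch_multiple(history)
print(len(trimmed), len(trimmed) // PATCH_SIZE)  # 528 33
```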
93
+ ---
94
+
95
+ ## License
96
+
97
+ Chorous1 is released under the [8f-ai-license-v1.0](https://huggingface.co/8Fai/license). Please review the full terms before use in production or commercial applications.