Update README.md
README.md
CHANGED
|
@@ -72,7 +72,7 @@ model = AutoModelForCausalLM.from_pretrained(
|
|
| 72 |
# )
|
| 73 |
|
| 74 |
# prepare input
|
| 75 |
-
batch_size, lookback_length =
|
| 76 |
seqs = torch.randn(batch_size, lookback_length).to(model.device)
|
| 77 |
|
| 78 |
# Note that Timer-S1 generates predictions at fixed quantile levels
|
|
@@ -87,18 +87,19 @@ print(output.shape) # batch_size x quantile_num(9) x forecast_length
|
|
| 87 |
print(output[0][4])
|
| 88 |
```
|
| 89 |
|
| 90 |
-
|
|
|
|
| 91 |
> ```python
|
| 92 |
> # Option 1: reduce batch size or context length
|
| 93 |
> batch_size, lookback_length = 1, 2880
|
| 94 |
>
|
| 95 |
> # Option 2: disable KV cache at runtime (or edit it in config.json for a permanent change)
|
| 96 |
-
> model.config.use_cache = False
|
| 97 |
> ```
|
| 98 |
|
| 99 |
## Specification
|
| 100 |
|
| 101 |
-
* **Architecture**: decoder-only Transformer
|
| 102 |
* **Context Length**: up to 11,520
|
| 103 |
* **ReNorm**: default=True
|
| 104 |
* **KV Cache**: default=True
|
|
|
|
| 72 |
# )
|
| 73 |
|
| 74 |
# prepare input
|
| 75 |
+
batch_size, lookback_length = 64, 11520
|
| 76 |
seqs = torch.randn(batch_size, lookback_length).to(model.device)
|
| 77 |
|
| 78 |
# Note that Timer-S1 generates predictions at fixed quantile levels
|
|
|
|
| 87 |
print(output[0][4])
|
| 88 |
```
|
| 89 |
|
| 90 |
+
|
| 91 |
+
> This model supports inference on either CPU or GPU. To load it on a GPU, we recommend **at least 40GB of VRAM** (e.g., A100 40GB/80GB, or H100). **Running out of memory at runtime?** Try one of the following options:
|
| 92 |
> ```python
|
| 93 |
> # Option 1: reduce batch size or context length
|
| 94 |
> batch_size, lookback_length = 1, 2880
|
| 95 |
>
|
| 96 |
> # Option 2: disable KV cache at runtime (or edit it in config.json for a permanent change)
|
| 97 |
+
> model.config.use_cache = False  # no efficiency impact when the prediction horizon does not exceed 256
|
| 98 |
> ```
|
| 99 |
|
| 100 |
## Specification
|
| 101 |
|
| 102 |
+
* **Architecture**: decoder-only Transformer with MoE
|
| 103 |
* **Context Length**: up to 11,520
|
| 104 |
* **ReNorm**: default=True
|
| 105 |
* **KV Cache**: default=True
|
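Option 1 in the note added by this change (reduce the batch size) does not have to mean forecasting fewer series: the same 64 sequences can be processed in smaller slices. The sketch below is one possible way to do that and is not part of the README. `forecast_in_chunks` and `forecast_fn` are hypothetical names introduced here for illustration; `forecast_fn` stands in for whatever call produces the forecasts for a batch, and it is assumed to return one result per input row.

```python
from typing import Callable, List, Sequence


def forecast_in_chunks(
    seqs: Sequence,
    forecast_fn: Callable[[Sequence], Sequence],
    chunk_size: int,
) -> List:
    """Run forecast_fn over seqs in slices of at most chunk_size rows.

    forecast_fn is a hypothetical stand-in for the model call; it is
    assumed to return one result per row of the slice it receives.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    outputs: List = []
    for start in range(0, len(seqs), chunk_size):
        # Only chunk_size rows are handed to the model at a time,
        # which bounds the peak memory used by a single call.
        chunk = seqs[start:start + chunk_size]
        outputs.extend(forecast_fn(chunk))
    return outputs
```

Because the helper only relies on `len()` and slicing, it should also accept a 2-D tensor such as `seqs`; in that case the returned list would hold one per-row result each, which could be stacked back into a single tensor if needed.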