Time Series Forecasting
Transformers
Safetensors
Timer-S1
text-generation
time series
time-series
forecasting
foundation models
pretrained models
time series foundation models
quantized
4-bit precision
bitsandbytes
unofficial
custom_code
How to use geetu040/Timer-S1-quantized-4bit with Transformers:

    # Load model directly
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "geetu040/Timer-S1-quantized-4bit",
        trust_remote_code=True,
        dtype="auto",
    )
Add files using upload-large-folder tool
README.md CHANGED

@@ -55,12 +55,12 @@ The checkpoint configuration records the following quantization settings:
   "quant_method": "bitsandbytes",
   "bnb_4bit_quant_type": "fp4",
   "bnb_4bit_quant_storage": "uint8",
-  "bnb_4bit_compute_dtype": "
+  "bnb_4bit_compute_dtype": "bfloat16",
   "bnb_4bit_use_double_quant": false
 }
 ```

-The model config also sets `use_cache` to `
+The model config also sets `use_cache` to `true`, matching the local quantized checkpoint. For lower memory usage during generation, set `model.config.use_cache = False` after loading the model.

 ## Quickstart

@@ -82,6 +82,10 @@ model = AutoModelForCausalLM.from_pretrained(
     device_map="auto",
 )

+# Optional: reduce generation memory usage by disabling the KV cache.
+# This can be useful on smaller GPUs or for longer lookback windows.
+model.config.use_cache = False
+
 batch_size, lookback_length = 1, 2880
 seqs = torch.randn(batch_size, lookback_length).to(model.device)

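
The quantization settings recorded in this README change can be checked against a downloaded checkpoint before loading it. The following is a minimal sketch, not part of the repository: `EXPECTED_QUANT_SETTINGS`, `find_mismatches`, and the `sample` excerpt are hypothetical names and data introduced for illustration, and the sketch assumes the settings live under the `quantization_config` entry of `config.json`. Only the keys visible in the change are listed; the real config may contain more.

```python
import json

# Settings as they appear on the added side of the README change.
# Only the keys visible in the hunk are listed (assumption: the real
# quantization config may contain additional keys).
EXPECTED_QUANT_SETTINGS = {
    "quant_method": "bitsandbytes",
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_quant_storage": "uint8",
    "bnb_4bit_compute_dtype": "bfloat16",
    "bnb_4bit_use_double_quant": False,
}


def find_mismatches(quant_config, expected=EXPECTED_QUANT_SETTINGS):
    """Return {key: (expected, actual)} for every key that differs or is missing."""
    return {
        key: (want, quant_config.get(key))
        for key, want in expected.items()
        if quant_config.get(key) != want
    }


# Hypothetical excerpt standing in for the "quantization_config" entry
# of a downloaded config.json.
sample = json.loads("""
{
  "quant_method": "bitsandbytes",
  "bnb_4bit_quant_type": "fp4",
  "bnb_4bit_quant_storage": "uint8",
  "bnb_4bit_compute_dtype": "bfloat16",
  "bnb_4bit_use_double_quant": false
}
""")

mismatches = find_mismatches(sample)
print("mismatches:", mismatches)
```

An empty result means the excerpt agrees with the documented settings. A missing key is reported with `None` as its actual value, so a truncated or outdated config shows up as a mismatch.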