---
license: apache-2.0
metrics:
- mse
- mae
- mase
- wql
- crps
pipeline_tag: time-series-forecasting
datasets:
- thuml/UTSD
- Salesforce/lotsa_data
- Salesforce/GiftEvalPretrain
- autogluon/chronos_datasets
tags:
- time series
- time-series
- forecasting
- foundation models
- pretrained models
- time series foundation models
- quantized
- 4-bit
- bitsandbytes
- unofficial
library_name: transformers
base_model:
- bytedance-research/Timer-S1
---
# Timer-S1 Quantized 4-bit
This repository contains an **unofficial 4-bit BitsAndBytes quantized checkpoint** derived from [`bytedance-research/Timer-S1`](https://huggingface.co/bytedance-research/Timer-S1).
Timer-S1 is a time series foundation model for zero-shot forecasting. The original model card describes Timer-S1 as a decoder-only Mixture-of-Experts Transformer with **8.3B** total parameters, **0.75B** activated parameters per token, and a context length of **11,520**. For details about the original model, architecture, training data, benchmark results, and intended use, refer to the upstream model card and the [Timer-S1 technical report](https://arxiv.org/pdf/2603.04791).
This upload preserves the upstream Timer-S1 remote-code implementation files and Apache-2.0 license metadata, but stores the model weights as a pre-quantized 4-bit checkpoint for lower-memory inference.
## Source and Provenance
- **Base model**: `bytedance-research/Timer-S1`
- **Quantization**: BitsAndBytes 4-bit quantization
- **Status**: unofficial derivative checkpoint
No new training or benchmark claims are made for this quantized checkpoint. Numerical outputs may differ slightly from the original bfloat16 checkpoint because the weights are quantized.
## Quantization Details
The checkpoint configuration records the following quantization settings:
```json
{
"load_in_4bit": true,
"load_in_8bit": false,
"quant_method": "bitsandbytes",
"bnb_4bit_quant_type": "fp4",
"bnb_4bit_quant_storage": "uint8",
"bnb_4bit_compute_dtype": "bfloat16",
"bnb_4bit_use_double_quant": false
}
```
The model config also sets `use_cache` to `true`, as saved with the quantized checkpoint. To lower memory usage during generation, set `model.config.use_cache = False` after loading the model; this typically comes at the cost of slower generation.
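For a rough sense of what the quantization settings above mean for weight memory, the following back-of-envelope sketch compares bfloat16 storage with 4-bit storage. It is an estimate, not a measurement of this checkpoint. It assumes that all 8.3B parameters are quantized (in practice, BitsAndBytes usually leaves some layers such as embeddings and norms in higher precision), that the 4-bit block size is 64, and that each block carries one float32 scale because double quantization is disabled. The names `TOTAL_PARAMS`, `BLOCK_SIZE`, `ABSMAX_BYTES`, and `weight_gib` are illustrative only.

```python
# Back-of-envelope weight-memory estimate; not a measurement of this checkpoint.
TOTAL_PARAMS = 8.3e9   # total parameters, as stated on the upstream model card
BLOCK_SIZE = 64        # assumption: default BitsAndBytes 4-bit block size
ABSMAX_BYTES = 4       # assumption: one float32 scale per block (no double quant)


def weight_gib(params: float, bytes_per_param: float) -> float:
    """Convert a parameter count and a per-parameter cost into GiB."""
    return params * bytes_per_param / 2**30


# bfloat16 stores 2 bytes per parameter.
bf16_gib = weight_gib(TOTAL_PARAMS, 2.0)

# FP4 stores half a byte per parameter plus each parameter's share of the block scale.
fp4_bytes_per_param = 0.5 + ABSMAX_BYTES / BLOCK_SIZE
fp4_gib = weight_gib(TOTAL_PARAMS, fp4_bytes_per_param)

print(f"bfloat16 weights: ~{bf16_gib:.1f} GiB")
print(f"FP4 weights:      ~{fp4_gib:.1f} GiB")
```

Activations, the KV cache, and any layers kept in higher precision add to these figures, so actual GPU memory use will be higher.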
## Quickstart
Install the expected runtime dependencies:
```bash
pip install torch accelerate bitsandbytes "transformers~=4.57.1"
```
Load the model with Hugging Face Transformers:
```python
import torch
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
"geetu040/Timer-S1-quantized-4bit",
trust_remote_code=True,
device_map="auto",
)
# Optional: reduce generation memory usage by disabling the KV cache.
# This can be useful on smaller GPUs or for longer lookback windows.
model.config.use_cache = False
batch_size, lookback_length = 1, 2880
seqs = torch.randn(batch_size, lookback_length).to(model.device)
forecast_length = 256
output = model.generate(seqs, max_new_tokens=forecast_length, revin=True)
# Timer-S1 generates forecasts at quantile levels:
# [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
print(output.shape) # batch_size x quantile_num(9) x forecast_length
print(output[0][4]) # median forecast for the first sample
```
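Because the quantile axis is ordered from 0.1 to 0.9, a forecast interval can be read off by index. The sketch below shows one way to map quantile levels to positions and to pull out a lower bound, median, and upper bound for a single sample. `quantile_index` and `interval` are hypothetical helpers, not part of the Timer-S1 code, and the nested list stands in for `output[0]` so the example runs without a GPU; the same indexing works on a tensor of shape quantile_num x forecast_length.

```python
# Quantile levels in the order Timer-S1 returns them (see the Quickstart comment).
QUANTILE_LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def quantile_index(level: float) -> int:
    """Return the position of `level` along the quantile axis."""
    for i, q in enumerate(QUANTILE_LEVELS):
        if abs(q - level) < 1e-9:  # tolerate floating-point noise
            return i
    raise ValueError(f"unsupported quantile level: {level}")


def interval(sample, lower: float = 0.1, upper: float = 0.9):
    """Pick (lower, median, upper) forecasts from one sample.

    `sample` is indexed as sample[quantile][time step].
    """
    return (
        sample[quantile_index(lower)],
        sample[quantile_index(0.5)],
        sample[quantile_index(upper)],
    )


# Stand-in for output[0]: 9 quantile rows with 3 forecast steps each.
fake_sample = [[q * 10 + t for t in range(3)] for q in range(9)]
low, median, high = interval(fake_sample)
print(low, median, high)
```

With the default arguments, `low` and `high` bound a nominal 80% interval around the median forecast.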
## Specification
- **Architecture**: decoder-only Transformer with MoE
- **Context length**: up to 11,520
- **Patch length**: 16
- **Quantiles**: 0.1 through 0.9
- **Hidden size**: 1024
- **Attention heads**: 16
- **Experts**: 32 total, 2 selected per token
- **Hidden layers**: 24
- **Weight format**: `model.safetensors`
- **Quantization**: BitsAndBytes 4-bit FP4
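The patch length and context length above determine how many tokens a given lookback window occupies. The sketch below works through that arithmetic, assuming the series is split into non-overlapping patches of 16 points. `num_patches` and `usable_lookback` are hypothetical helpers; whether the remote code requires the lookback to be a multiple of the patch length is not stated here, so the rounding in `usable_lookback` is a conservative assumption rather than a documented requirement.

```python
PATCH_LENGTH = 16      # from the specification list
MAX_CONTEXT = 11_520   # from the specification list


def num_patches(length: int) -> int:
    """Number of patches needed to cover `length` time points (ceiling division)."""
    return -(-length // PATCH_LENGTH)


def usable_lookback(length: int) -> int:
    """Cap a lookback at the maximum context and round down to whole patches.

    Assumption: trimming to a multiple of the patch length is a safe default.
    """
    capped = min(length, MAX_CONTEXT)
    return capped - capped % PATCH_LENGTH


print(MAX_CONTEXT // PATCH_LENGTH)  # tokens at full context
print(num_patches(2880))            # tokens for the Quickstart lookback
print(num_patches(256))             # tokens for the Quickstart forecast length
print(usable_lookback(3000))        # trimmed to whole patches
print(usable_lookback(20_000))      # capped at the maximum context
```

When trimming a real series this way, keep the most recent points, for example `series[-usable_lookback(len(series)):]`.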
## License
The upstream Timer-S1 model card lists the model under the Apache-2.0 License. This repository preserves that license metadata.
## Citation
If you use this quantized checkpoint, cite the original Timer-S1 paper:
```bibtex
@article{liu2026timer,
title={Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling},
author={Liu, Yong and Su, Xingjian and Wang, Shiyu and Zhang, Haoran and Liu, Haixuan and Wang, Yuxuan and Ye, Zhou and Xiang, Yang and Wang, Jianmin and Long, Mingsheng},
journal={arXiv preprint arXiv:2603.04791},
year={2026}
}
```