	---
	license: apache-2.0
	metrics:
	- mse
	- mae
	- mase
	- wql
	- crps
	pipeline_tag: time-series-forecasting
	datasets:
	- thuml/UTSD
	- Salesforce/lotsa_data
	- Salesforce/GiftEvalPretrain
	- autogluon/chronos_datasets
	tags:
	- time series
	- time-series
	- forecasting
	- foundation models
	- pretrained models
	- time series foundation models
	- quantized
	- 4-bit
	- bitsandbytes
	- unofficial
	library_name: transformers
	base_model:
	- bytedance-research/Timer-S1
	---

	# Timer-S1 Quantized 4-bit

	This repository contains an unofficial 4-bit BitsAndBytes quantized checkpoint derived from [`bytedance-research/Timer-S1`](https://huggingface.co/bytedance-research/Timer-S1).

	Timer-S1 is a time series foundation model for zero-shot forecasting. The original model card describes Timer-S1 as a decoder-only Mixture-of-Experts Transformer with 8.3B total parameters, 0.75B activated parameters per token, and a context length of 11,520. For details about the original model, architecture, training data, benchmark results, and intended use, refer to the upstream model card and the [Timer-S1 technical report](https://arxiv.org/pdf/2603.04791).

	This upload preserves the upstream Timer-S1 remote-code implementation files and Apache-2.0 license metadata, but stores the model weights as a pre-quantized 4-bit checkpoint for lower-memory inference.

	## Source and Provenance

	- Base model: `bytedance-research/Timer-S1`
	- Quantization: BitsAndBytes 4-bit quantization
	- Status: unofficial derivative checkpoint

	This checkpoint involves no additional training, and no benchmark claims are made for it. Because the weights are quantized, numerical outputs may differ slightly from those of the original bfloat16 checkpoint.

	## Quantization Details

	The checkpoint configuration records the following quantization settings:

	```json
	{
	  "load_in_4bit": true,
	  "load_in_8bit": false,
	  "quant_method": "bitsandbytes",
	  "bnb_4bit_quant_type": "fp4",
	  "bnb_4bit_quant_storage": "uint8",
	  "bnb_4bit_compute_dtype": "bfloat16",
	  "bnb_4bit_use_double_quant": false
	}
	```

	The checkpoint's model config also sets `use_cache` to `true`, so the KV cache is enabled by default. To reduce memory usage during generation, at the cost of slower decoding, set `model.config.use_cache = False` after loading the model.
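To give a rough sense of what 4-bit storage buys, the sketch below estimates weight memory from the 8.3B total parameter count quoted in the upstream model card. It is a back-of-the-envelope calculation, not a measurement of this checkpoint: it assumes every parameter is stored at the stated bit width, whereas BitsAndBytes typically leaves some tensors (for example normalization weights) in higher precision and stores extra per-block scaling constants, so the real 4-bit footprint will be larger than the figure printed here. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Total parameter count quoted from the upstream Timer-S1 model card.
TOTAL_PARAMS = 8.3e9

# Assumption: all parameters are stored at a single bit width.
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
fp4_gb = weight_memory_gb(TOTAL_PARAMS, 4)

print(f"bfloat16 weights: ~{bf16_gb:.1f} GB")
print(f"4-bit weights:    ~{fp4_gb:.2f} GB (idealized lower bound)")
```

Activation memory and the KV cache come on top of these figures and grow with the lookback length.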

	## Quickstart

	Install the expected runtime dependencies:

	```bash
	pip install torch accelerate bitsandbytes "transformers~=4.57.1"
	```
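Before loading the model, it can help to confirm that the packages from the install command are actually present in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the check reports versions without enforcing the `transformers~=4.57.1` constraint.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names taken from the pip command above.
REQUIRED = ["torch", "accelerate", "bitsandbytes", "transformers"]

for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```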

	Load the model with Hugging Face Transformers:

	```python
	import torch
	from transformers import AutoModelForCausalLM

	model = AutoModelForCausalLM.from_pretrained(
	    "geetu040/Timer-S1-quantized-4bit",
	    trust_remote_code=True,
	    device_map="auto",
	)

	# Optional: reduce generation memory usage by disabling the KV cache.
	# This can be useful on smaller GPUs or for longer lookback windows.
	model.config.use_cache = False

	batch_size, lookback_length = 1, 2880
	seqs = torch.randn(batch_size, lookback_length).to(model.device)

	forecast_length = 256
	output = model.generate(seqs, max_new_tokens=forecast_length, revin=True)

	# Timer-S1 generates forecasts at quantile levels:
	# [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
	print(output.shape)  # (batch_size, 9 quantiles, forecast_length)
	print(output[0][4])  # median (0.5 quantile) forecast for the first sample
	```
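Indexing the quantile axis by position, as in `output[0][4]`, is easy to get wrong. The sketch below maps quantile levels to positions, assuming the axis is ordered exactly as listed in the comment in the Quickstart code. The helpers `quantile_index` and `interval_indices` are hypothetical names introduced for illustration and are not part of the Timer-S1 code.

```python
# Quantile levels in the order given in the Quickstart comment.
QUANTILE_LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def quantile_index(level):
    """Return the position of `level` along the quantile axis."""
    for i, q in enumerate(QUANTILE_LEVELS):
        if abs(q - level) < 1e-9:  # tolerate floating-point rounding
            return i
    raise ValueError(f"{level} is not one of {QUANTILE_LEVELS}")


def interval_indices(coverage):
    """Return (lower, upper) positions of a central prediction interval."""
    tail = (1.0 - coverage) / 2.0
    return quantile_index(tail), quantile_index(1.0 - tail)


median_idx = quantile_index(0.5)
lower_idx, upper_idx = interval_indices(0.8)
print(median_idx, lower_idx, upper_idx)
```

With the `output` tensor from the Quickstart, `output[:, lower_idx]` and `output[:, upper_idx]` would then bound a central 80% interval (the 0.1 and 0.9 quantiles) and `output[:, median_idx]` would be the median forecast.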

	## Specification

	- Architecture: decoder-only Transformer with MoE
	- Context length: up to 11,520
	- Patch length: 16
	- Quantiles: 0.1 through 0.9 in steps of 0.1 (9 levels)
	- Hidden size: 1024
	- Attention heads: 16
	- Experts: 32 total, 2 selected per token
	- Hidden layers: 24
	- Weight format: `model.safetensors`
	- Quantization: BitsAndBytes 4-bit FP4

	## License

	The upstream Timer-S1 model card lists the model under the Apache-2.0 License. This repository preserves that license metadata.

	## Citation

	If you use this quantized checkpoint, cite the original Timer-S1 paper:

	```bibtex
	@article{liu2026timer,
	  title={Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling},
	  author={Liu, Yong and Su, Xingjian and Wang, Shiyu and Zhang, Haoran and Liu, Haixuan and Wang, Yuxuan and Ye, Zhou and Xiang, Yang and Wang, Jianmin and Long, Mingsheng},
	  journal={arXiv preprint arXiv:2603.04791},
	  year={2026}
	}
	```