Thoth: Mid-Training Bridges LLMs to Time Series Understanding

Paper GitHub Repo Hugging Face

πŸ“„ Introduction

While Large Language Models (LLMs) demonstrate exceptional proficiency in general reasoning, they often struggle to capture intricate temporal dependencies. To bridge this gap, Thoth introduces the first family of mid-trained LLMs that move beyond task-specific Supervised Fine-Tuning (SFT) through a task- and domain-agnostic mid-training stage. Leveraging an automated synthesis pipeline that achieves bidirectional alignment between time-series-to-text and text-to-time-series generation, Thoth equips models with an intrinsic, foundational understanding of temporal dynamics. This internalized comprehension enables the model to improve performance across a wide range of complex, knowledge-intensive downstream time series reasoning tasks in real-world scenarios.
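Mid-training on time-series-to-text pairs requires rendering raw numeric series as plain text the model can consume. As a minimal, illustrative sketch (not the paper's actual synthesis pipeline, whose format may differ), a series can be serialized like this:

```python
def serialize_series(values, unit="kWh", freq="hourly"):
    """Render a numeric time series as a plain-text prompt fragment.

    Illustrative helper only; Thoth's real data synthesis pipeline is
    described in the paper and may use a different format.
    """
    data = ", ".join(f"{v:.1f}" for v in values)
    return f"The following {freq} readings are in {unit}.\nData: [{data}]"

prompt = serialize_series([12.5, 11.8, 65.5])
print(prompt)
# The following hourly readings are in kWh.
# Data: [12.5, 11.8, 65.5]
```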


Thoth-30B-A3B is a full-parameter fine-tuned version of Qwen3-30B-A3B-Instruct-2507. For more details, please see our paper.

✨ Quickstart

pip install transformers==4.57.1
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "thuml/Thoth-30B-A3B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16, trust_remote_code=True).eval()

# A simple time series anomaly detection task
question = """The following data represents the hourly electricity consumption (in kWh) of an office building over a 24-hour period, starting from midnight (00:00).
Data: [12.5, 11.8, 12.1, 11.5, 12.2, 11.9, 15.6, 32.4, 35.1, 34.8, 36.2, 65.5, 37.0, 35.5, 34.2, 33.9, 35.1, 31.8, 18.2, 14.5, 13.1, 12.8, 12.4, 11.9]
Task: 1. Specify the hour (0-23) when the anomaly occurs. 2. Provide a brief reasoning why you consider it an anomaly."""

messages = [
    {"role": "system", "content": "You are an expert in time series understanding and reasoning."},
    {"role": "user", "content": question}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate the reasoning output (temperature only takes effect when sampling is enabled)
generated_ids = model.generate(**model_inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, skipping the echoed prompt
output_ids = generated_ids[0][model_inputs.input_ids.shape[1]:]
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)
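Since Thoth is mid-trained for bidirectional alignment, the same chat interface can also be prompted for text-to-time-series generation (e.g., forecasting a continuation of the series). A hedged sketch of post-processing such a reply: the hypothetical helper below extracts the first bracketed list of numbers from a response, assuming the model emits values in a `[...]` list (real replies may need more robust parsing).

```python
import re

def extract_series(reply: str) -> list[float]:
    """Pull the first bracketed list of numbers out of a model reply.

    Assumes a reply like 'Forecast: [12.1, 11.9, ...]'; this is an
    illustrative assumption, not a guaranteed output format.
    """
    match = re.search(r"\[([^\]]+)\]", reply)
    if match is None:
        return []
    return [float(tok) for tok in match.group(1).split(",")]

print(extract_series("Forecast: [12.1, 11.9, 12.4]"))
# [12.1, 11.9, 12.4]
```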

For detailed evaluation, please visit our GitHub repository: https://github.com/thuml/Thoth.

πŸš€ Release Progress

  • Thoth-30B-A3B model weights
  • Public benchmark evaluation pipeline
  • KnoTS benchmark
  • KnoTS evaluation code

πŸ“œ Citation

If you find our work useful, please cite our paper as:

@article{lin2026thoth,
  title={Thoth: Mid-Training Bridges LLMs to Time Series Understanding},
  author={Lin, Jiafeng and Wang, Yuxuan and Wu, Jialong and Luo, Huakun and Pei, Zhongyi and Wang, Jianmin},
  journal={arXiv preprint arXiv:2603.01042},
  year={2026}
}

🀝 Contact

If you have any questions, feel free to contact:

πŸ’‘ Acknowledgment

We sincerely appreciate the following works for their valuable open-source models and evaluation benchmarks: Qwen3, Time-MQA, ChatTime, ChatTS.
