| language: | |
| - en | |
| license: cc-by-sa-4.0 | |
| library_name: transformers | |
| tags: | |
| - causal-lm | |
| - sequential-pretraining | |
| - helium | |
| - kyutai | |
| datasets: | |
| - kyutai/KairosQA | |
| metrics: | |
| - accuracy | |
| # Helium 6B: Sequential vs. Shuffled Pretraining | |
| <p align="center"> | |
| <img src="https://huggingface.co/kyutai/Sequential_Helium_6B/resolve/main/kairos_seq_model.png" width="400" alt="Kairos Sequential Model Logo"> | |
| </p> | |
| This repository hosts the **Helium 6B** models, built to compare **sequential pretraining** on temporally ordered data with standard **shuffled pretraining**. The goal of this research is to understand how data order affects a model's ability to retain facts and avoid chronological confusion. | |
| The architecture is derived from [Helium 2B](https://huggingface.co/kyutai/helium-1-2b). | |
| ## Model Details | |
| - **Developed by:** Kyutai | |
| - **Model type:** Large Language Model (Decoder-only) | |
| - **Language(s):** Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Irish, Croatian, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovenian, Swedish. | |
| - **License:** CC-BY-SA-4.0 | |
| - **Base Model:** Helium 2B Architecture (scaled) | |
| --- | |
| ## Uses | |
| ### Direct Use | |
| The sequential variant is engineered to improve **factuality on recent knowledge**. To support this research, we developed: | |
| * **[KairosQA](https://huggingface.co/datasets/kyutai/KairosQA):** A benchmark of 7,000+ temporally grounded questions. | |
| * **[Kairos Evaluation Code](https://github.com/kyutai-labs/kairos):** Tools to analyze how models associate facts with specific time periods. | |
| ### Out-of-Scope Use | |
| * **Instruction Following:** These are base models and have not undergone SFT or RLHF. They will not respond well to direct prompts or "chat" style interactions without further tuning. | |
| * **Multilingual:** Do not use the model in languages other than those it was trained on. | |
| * **Malicious Intent:** Any illegal or harmful activity is strictly prohibited. | |
| --- | |
| ## Bias, Risks, and Limitations | |
| Helium 6B is a base model and has not been aligned with human preferences. | |
| * **Content:** It may generate biased, incorrect, or harmful content. | |
| * **Recommendation:** Do not use for downstream applications without rigorous alignment (SFT/RLHF) and risk mitigation. | |
| --- | |
| ## How to Get Started | |
| ### Loading the Base Model | |
| ```python | |
| import torch | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| model_id = "kyutai/Sequential_Helium_6B" | |
| tokenizer = AutoTokenizer.from_pretrained(model_id) | |
| model = AutoModelForCausalLM.from_pretrained( | |
|     model_id, | |
|     torch_dtype=torch.bfloat16, | |
|     device_map="auto", | |
| ) | |
| ``` | |
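Because these are base models without instruction tuning, they tend to behave best when the prompt is written as text to continue rather than as a request. The sketch below is one way to do that. `build_few_shot_prompt` and `complete` are hypothetical helper names (they are not part of Transformers or of this repository), and the `Question:`/`Answer:` layout is an assumed format, not one prescribed by the authors. `complete` expects the `model` and `tokenizer` objects loaded above.

```python
def build_few_shot_prompt(examples, question):
    """Format (question, answer) pairs, then an open question for the model to continue.

    The "Question:/Answer:" layout is an illustrative assumption, not an official format.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


def complete(model, tokenizer, prompt, max_new_tokens=32):
    """Greedy-decode a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so that only the continuation is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0, prompt_length:], skip_special_tokens=True)


# Example usage (requires the model and tokenizer loaded above):
# prompt = build_few_shot_prompt(
#     [("What is the capital of France?", "Paris")],
#     "What is the capital of Italy?",
# )
# print(complete(model, tokenizer, prompt))
```

Greedy decoding (`do_sample=False`) is used here only to keep the output deterministic. Sampling parameters can be passed to `generate` instead.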
| ### Loading Temporal Checkpoints | |
| To access a specific stage of training (e.g., the 2024 sequential checkpoint): | |
| ```python | |
| # Reuses torch, AutoModelForCausalLM, and model_id from the previous snippet. | |
| model = AutoModelForCausalLM.from_pretrained( | |
|     model_id, | |
|     subfolder="sequential_2024", | |
|     torch_dtype=torch.bfloat16, | |
|     device_map="auto", | |
| ) | |
| ``` | |
| The available checkpoints are listed below: | |
| Subfolder | Tokens | Cutoff date | Min. date | Shuffled? | |
| |--------------|:------:|:------:|:------:|:------:| | |
| | | | | | | | |
| | Main ("") | 2.5T | 2025 | 2018 | no | | |
| | sequential_2024<sup>*</sup> | 2.2T | 2024 | 2018 | no | | |
| | sequential_2023<sup>*</sup> | 1.9T | 2023 | 2018 | no | | |
| | sequential_2022<sup>*</sup> | 1.6T | 2022 | 2018 | no | | |
| | sequential_2021<sup>*</sup> | 1.2T | 2021 | 2018 | no | | |
| | sequential_2020<sup>*</sup> | 0.9T | 2020 | 2018 | no | | |
| | shuffle_eq_2020 | 0.9T | 2024 | 2020 | yes | | |
| | shuffle_eq_2024 | 2.2T | 2024 | 2020 | yes | | |
| shuffle_eq_2025 | 2.5T | 2024 | 2020 | yes | |
| <sup>*</sup> **Note on Non-Cooldown Variants:** For these specific checkpoints, we can also provide "non-cooldown" counterparts. These are extracted directly from the training process at the equivalent token count without applying a learning rate decay (cooldown phase). | |
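When comparing several checkpoints, it can help to derive the subfolder name from the cutoff year instead of typing it by hand. The following is a minimal sketch based only on the subfolder names in the table above. `sequential_subfolder` and `load_checkpoint` are hypothetical helpers, not part of Transformers or of this repository.

```python
MODEL_ID = "kyutai/Sequential_Helium_6B"
SEQUENTIAL_CUTOFFS = (2020, 2021, 2022, 2023, 2024)  # stored as sequential_<year>
MAIN_CUTOFF = 2025  # stored at the repository root, i.e. subfolder ""


def sequential_subfolder(cutoff_year):
    """Map a cutoff year from the checkpoint table to its subfolder name."""
    if cutoff_year == MAIN_CUTOFF:
        return ""
    if cutoff_year in SEQUENTIAL_CUTOFFS:
        return f"sequential_{cutoff_year}"
    raise ValueError(f"No sequential checkpoint listed for cutoff year {cutoff_year}")


def load_checkpoint(subfolder=""):
    """Load one checkpoint in bfloat16, using the same arguments as the snippets above."""
    # Imported lazily so the name helper works without PyTorch installed.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        subfolder=subfolder,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
```

For example, `load_checkpoint(sequential_subfolder(2022))` would load the 2022 sequential checkpoint, and `load_checkpoint("shuffle_eq_2024")` would load its shuffled counterpart at the same token count.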
| ## Training Details | |
| ### Training Data | |
| Helium 6B checkpoints were trained on data from Common Crawl, which was preprocessed with the [dactory](https://github.com/kyutai-labs/dactory) library. | |
| ## Evaluation | |
| #### Testing Data | |
| Our models are primarily designed to support research on LLM temporality and base model dynamics, so their general performance may be lower than that of state-of-the-art models. We nonetheless evaluated them on the OLMES benchmark suite, which covers MMLU, ARC (Easy and Challenge), OpenBookQA, CommonsenseQA, PIQA, SIQA, HellaSwag, WinoGrande, and BoolQ. | |
| #### English Results after 2.5T training tokens | |
| | Benchmark | Sequential-Helium 6B | Shuffled-Helium 6B | | |
| |--------------|:------:|:------:| | |
| | | | | | |
| | MMLU | 59.2 | 56.9 | | |
| | ARC E | 87.7 | 86.6 | | |
| | ARC C | 74.6 | 72.3 | | |
| | OBQA | 74.0 | 72.8 | | |
| | CSQA | 73.6 | 74.2 | | |
| | PIQA | 79.9 | 80.3 | | |
| | SIQA | 66.9 | 67.6 | | |
| | HS | 78.9 | 81.2 | | |
| | WG | 73.2 | 73.3 | | |
| BoolQ | 84.0 | 83.7 | |
| | | | | | |
| | OLMES | 77.0 | 77.0 | | |
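To see at a glance where the two training regimes differ, the per-benchmark gaps can be computed directly from the table. This is a small illustrative sketch: the scores are copied from the table above, and `score_gaps` is a hypothetical helper name. The aggregate OLMES row is left out.

```python
# Scores copied from the results table: (sequential, shuffled).
SCORES = {
    "MMLU": (59.2, 56.9),
    "ARC E": (87.7, 86.6),
    "ARC C": (74.6, 72.3),
    "OBQA": (74.0, 72.8),
    "CSQA": (73.6, 74.2),
    "PIQA": (79.9, 80.3),
    "SIQA": (66.9, 67.6),
    "HS": (78.9, 81.2),
    "WG": (73.2, 73.3),
    "BoolQ": (84.0, 83.7),
}


def score_gaps(scores):
    """Return sequential minus shuffled for each benchmark, rounded to one decimal."""
    return {name: round(seq - shuf, 1) for name, (seq, shuf) in scores.items()}


gaps = score_gaps(SCORES)
sequential_ahead = sorted(name for name, gap in gaps.items() if gap > 0)
shuffled_ahead = sorted(name for name, gap in gaps.items() if gap < 0)

for name, gap in gaps.items():
    print(f"{name:6s} {gap:+.1f}")
print("Sequential ahead:", ", ".join(sequential_ahead))
print("Shuffled ahead:", ", ".join(shuffled_ahead))
```

On these numbers each model leads on five of the ten benchmarks, which is consistent with the tied OLMES score.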
| ### Temporal improvements | |
| In the paper [Understanding Data Temporality Impact on Large Language Models Pre-training](https://arxiv.org/abs/2605.22769), we show that our sequentially trained Helium 6B has more up-to-date knowledge, as measured on our [KairosQA](https://huggingface.co/datasets/kyutai/KairosQA) dataset. | |
| ### Licensing | |
| Helium 6B models are licensed under the CC-BY-SA 4.0 license. | |
| ## Citations | |
| If you use one of these models, please cite: | |
| ```bibtex | |
| @misc{pilchen2026understandingdatatemporalityimpact, | |
| title={Understanding Data Temporality Impact on Large Language Models Pre-training}, | |
| author={Hippolyte Pilchen and Romain Fabre and Franck Signe Talla and Patrick Perez and Edouard Grave}, | |
| year={2026}, | |
| eprint={2605.22769}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CL}, | |
| url={https://arxiv.org/abs/2605.22769}, | |
| } | |
| ``` | |