HippolyteP committed on Feb 3

Commit

bf3c992

1 Parent(s): e1e2cee

Update README.md

Files changed (1)

README.md +80 -9

@@ -11,28 +11,99 @@ license: cc-by-sa-4.0
  This page houses `Helium 6B` models trained using either a sequential pretraining on temporally ordered data or using a standard pretraining on shuffled data. The architecture is derived from [Helium 2B](https://huggingface.co/kyutai/helium-1-2b).
- ## Models Details
- ### Uses
- As described in the [paper](),
- ### Licensing
-Helium 6B models are licensed under the CC-BY-SA 4.0 license.
- ## Usage
- ```python
 ```
  ## Citations
  If you use one of these models, please cite:

  This page houses `Helium 6B` models trained using either a sequential pretraining on temporally ordered data or using a standard pretraining on shuffled data. The architecture is derived from [Helium 2B](https://huggingface.co/kyutai/helium-1-2b).
+- **Developed by:** Kyutai
+- **Model type:** Large Language Model
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+The intended use of the Helium model is research and development of natural language processing systems, including but not limited to language generation and understanding.
+For most downstream use cases, the model should be aligned with supervised fine-tuning, RLHF or related methods.
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+The model should not be used in languages other than those on which it was trained.
+The model is not intended to be used for any malicious or illegal activities of any kind.
+The model was not fine-tuned to follow instructions and should not be used as an instruction-following model.
+## Bias, Risks, and Limitations
+Helium 6B is a base language model that has not been aligned to human preferences.
+As such, the model can generate incorrect, biased, harmful or generally unhelpful content.
+Thus, the model should not be used for downstream applications without further alignment, evaluations and mitigations of risks.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_id = "kyutai/Sequential_Helium_6B"
+model = AutoModelForCausalLM.from_pretrained(model_id).cuda()
+tokenizer = AutoTokenizer.from_pretrained(model_id)
 ```
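Once the model and tokenizer are loaded as above, text can be generated with the standard `generate` API. The following is a minimal sketch and not part of the README: the helper name `complete`, the prompt, and the `max_new_tokens` value are illustrative assumptions. Because this is a base model, the prompt is continued as plain text rather than treated as an instruction.

```python
def complete(model, tokenizer, prompt, max_new_tokens=32):
    """Continue `prompt` with a loaded causal LM (hypothetical helper, not from the README)."""
    # Tokenize and move the input tensors to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # generate() returns one sequence of token ids per input prompt.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence; it includes the prompt tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage, with `model` and `tokenizer` from the snippet above:
# print(complete(model, tokenizer, "Once upon a time,"))
```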
+To load a specific checkpoint, for example the last (cooled-down) checkpoint of the sequential pretraining before any 2025 data:
+```python
+model = AutoModelForCausalLM.from_pretrained(model_id, subfolder='sequential_2024').cuda()
+```
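The two loading calls above differ only in the `subfolder` argument, so they can be wrapped in one helper. This is a sketch under assumptions: `load_checkpoint` and its `device` parameter are hypothetical names, and `sequential_2024` is the only checkpoint folder named in this README, so any other folder name would need to be checked against the repository's file listing.

```python
MODEL_ID = "kyutai/Sequential_Helium_6B"


def load_checkpoint(subfolder=None, device=None, model_id=MODEL_ID):
    """Load the default checkpoint, or the one stored under `subfolder`."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForCausalLM

    # Only pass `subfolder` when a specific checkpoint is requested.
    kwargs = {} if subfolder is None else {"subfolder": subfolder}
    model = AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
    # Move the model only if a device such as "cuda" is given; otherwise leave it on CPU.
    return model if device is None else model.to(device)


# Usage (downloads the weights on first call):
# model = load_checkpoint("sequential_2024", device="cuda")
```

Keeping the device optional lets the same helper run on machines without a GPU, where the `.cuda()` call in the original snippet would fail.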
+## Training Details
+### Training Data
+Helium-6B checkpoints were trained on data from Common Crawl, which was preprocessed with the [dactory](https://github.com/kyutai-labs/dactory) library.
+## Evaluation
+#### Testing Data
+The model was evaluated using [OLMES](https://arxiv.org/abs/2406.08446), an LLM evaluation benchmark based on MMLU, ARC Easy & Challenge, Open Book QA, Common Sense QA, Physical Interaction QA, Social Interaction QA, HellaSwag, WinoGrande and BoolQA.
+#### English Results
+| Benchmark | Sequential-Helium-6B | Shuffled-Helium-6B (2.5T tokens) |
+|--------------|:------:|:------:|
+| | | |
+| MMLU | 58.8 | 56.4 |
+| ARC E | 87.6 | 86.7 |
+| ARC C | 74.5 | 72.1 |
+| OBQA | 72.8 | 73.2 |
+| CSQA | 73.1 | 74.3 |
+| PIQA | 80.3 | 80.2 |
+| SIQA | 67.0 | 66.2 |
+| HS | 79.1 | 81.3 |
+| WG | 73.0 | 73.1 |
+| BoolQA | 83.9 | 83.9 |
+| | | |
+| OLMES | 75.0 | 74.7 |
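The `OLMES` row is consistent with the unweighted mean of the ten benchmark scores above it. That reading is inferred from the numbers and is not stated in the README; the short check below recomputes it from the table values.

```python
# Scores copied from the table above, in row order:
# MMLU, ARC E, ARC C, OBQA, CSQA, PIQA, SIQA, HS, WG, BoolQA.
sequential = [58.8, 87.6, 74.5, 72.8, 73.1, 80.3, 67.0, 79.1, 73.0, 83.9]
shuffled = [56.4, 86.7, 72.1, 73.2, 74.3, 80.2, 66.2, 81.3, 73.1, 83.9]


def average(scores):
    """Unweighted mean, rounded to one decimal like the table."""
    return round(sum(scores) / len(scores), 1)


print(average(sequential))  # 75.0
print(average(shuffled))    # 74.7
```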
+ ### Uses
+ As described in the [paper](),
+ ### Licensing
+Helium 6B models are licensed under the CC-BY-SA 4.0 license.
  ## Citations
  If you use one of these models, please cite: