Instructions for using NLPnorth/snakmodel-7b-instruct with libraries, inference providers, notebooks, and local apps.

Libraries

How to use NLPnorth/snakmodel-7b-instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NLPnorth/snakmodel-7b-instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NLPnorth/snakmodel-7b-instruct")
model = AutoModelForCausalLM.from_pretrained("NLPnorth/snakmodel-7b-instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use NLPnorth/snakmodel-7b-instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NLPnorth/snakmodel-7b-instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NLPnorth/snakmodel-7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/NLPnorth/snakmodel-7b-instruct

SGLang

How to use NLPnorth/snakmodel-7b-instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NLPnorth/snakmodel-7b-instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NLPnorth/snakmodel-7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NLPnorth/snakmodel-7b-instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NLPnorth/snakmodel-7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use NLPnorth/snakmodel-7b-instruct with Docker Model Runner:
```
docker model run hf.co/NLPnorth/snakmodel-7b-instruct
```

Max Müller-Eberstein committed on Oct 17, 2024

Commit

2eca9d8

1 Parent(s): a90dea5

added initial model card

Files changed (2)

README.md +87 -3
snakmodel.png +0 -0

README.md CHANGED

@@ -1,3 +1,87 @@
----
-license: llama2
----

+---
+license: llama2
+language:
+- da
+base_model:
+- meta-llama/Llama-2-7b-hf
+pipeline_tag: text-generation
+---
+![SnakModel Instruct Logo](snakmodel.png)
+## Model Details
+**SnakModel** is a 7B-parameter model specifically designed for the Danish language. This is the instruction-tuned variant: `SnakModel-7B (instruct)`. Our model builds upon [Llama 2](https://huggingface.co/meta-llama/Llama-2-7b-hf), which we continuously pre-train on a diverse collection of Danish corpora comprising 350M documents and 13.6B words, before tuning it on 3.7M Danish instruction-answer pairs.
+**Model Developers**
+[NLPnorth research unit](https://nlpnorth.github.io) at the [IT University of Copenhagen](https://itu.dk), Denmark
+**Variations**
+SnakModel comes in an instruction-tuned and a base version. In addition, each model includes intermediate checkpoints (under model revisions).
+**Input**
+Text only.
+**Output**
+Text only.
+**Model Architecture**
+SnakModel is an auto-regressive, transformer-based language model. The `instruct` version uses supervised fine-tuning (SFT) to enable instruction following in Danish.
+**Model Dates**
+SnakModel was trained between January 2024 and September 2024.
+**License**
+This model follows the original [Llama 2 license agreement](https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/main/LICENSE.txt).
+**Research Paper**
+[Released in Q1 2025]
+## Intended Use & Limitations
+**Intended Use Cases**
+SnakModel is intended for use in Danish. The instruction-tuned variant is intended for assistant-like chat.
+The `instruct` variant follows the Llama 2 (chat) instruction template, in which instructions are encapsulated in special tokens, i.e., `[INST] {instruction} [/INST]`.
+**Limitations**
+SnakModel variants are fine-tuned on Danish data. As such, use in other languages is out of scope. While we found SnakModel to be more proficient in Danish than other Llama 2-based models, it still frequently generates factually incorrect output. Make sure to carefully evaluate and weigh these factors before deploying the model. In addition, make sure to adhere to the original [Llama 2 license agreement](https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/main/LICENSE.txt).
+## Hardware and Software
+**Training Factors**
+SnakModel is trained on private infrastructure with one node, containing four NVIDIA A100-PCIe 40GB GPUs. The node has an AMD Epyc 7662 128 Core Processor and 1TB of RAM.
+**Carbon Footprint**
+Total training time amounted to 8,928 GPU hours, with an average carbon efficiency of 0.122 kg CO2eq/kWh. This is equivalent to 272.3 kg CO2eq emitted, based on the [Machine Learning Impact calculator](https://mlco2.github.io/impact).
+## Training Data
+**Overview**
+SnakModel was continuously pre-trained on a diverse collection of Danish corpora comprising 350M documents and 13.6B words. The `instruct` version is further tuned on 3.7M Danish instruction-answer pairs.
+[Details to follow in Q1 2025]
+**Data Freshness**
+The pre-training data has a cutoff of January 2024.
+## Evaluation Results
+[Released in Q1 2025]
+## Citation
+[Released in Q1 2025]
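
The model card above states that the `instruct` variant expects instructions wrapped as `[INST] {instruction} [/INST]`. A minimal sketch of building such a prompt by hand follows. The helper name `build_prompt` and the Danish example question are illustrative assumptions, not part of the model card, and the sketch covers only the single-turn markers the card mentions (no system prompt or multi-turn handling).

```python
def build_prompt(instruction: str) -> str:
    """Wrap one user instruction in the [INST] ... [/INST] markers.

    Hypothetical helper: only the marker format comes from the model card.
    """
    return f"[INST] {instruction.strip()} [/INST]"


# Illustrative Danish instruction ("What is the capital of Denmark?").
prompt = build_prompt("Hvad er hovedstaden i Danmark?")
print(prompt)
```

If the tokenizer ships a chat template, `tokenizer.apply_chat_template` is likely the safer route, since it would also take care of any special tokens the template requires.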

snakmodel.png ADDED
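
The carbon footprint figure in the model card can be roughly reproduced from the numbers it reports. The sketch below assumes the estimate is GPU hours multiplied by per-GPU power draw and by carbon intensity, and it assumes a draw of 250 W per GPU (the rated TDP of the A100 PCIe 40GB). Neither the formula nor the 250 W value is stated in the model card; they are assumptions that happen to be consistent with the reported total.

```python
# Figures reported in the model card.
GPU_HOURS = 8_928          # total GPU hours
CARBON_INTENSITY = 0.122   # kg CO2eq per kWh
NUM_GPUS = 4               # four A100-PCIe 40GB GPUs on one node

# Assumption (not in the model card): each GPU draws its rated 250 W.
ASSUMED_GPU_POWER_KW = 0.250

energy_kwh = GPU_HOURS * ASSUMED_GPU_POWER_KW
emissions_kg = energy_kwh * CARBON_INTENSITY

# Wall-clock time if all four GPUs ran in parallel for the whole period.
wall_clock_days = GPU_HOURS / NUM_GPUS / 24

print(f"Energy: {energy_kwh:.0f} kWh")
print(f"Emissions: {emissions_kg:.1f} kg CO2eq")
print(f"Wall-clock time: {wall_clock_days:.0f} days")
```

Under these assumptions the result rounds to the 272.3 kg CO2eq given in the card, and the GPU hours correspond to roughly three months of continuous training on the four-GPU node.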