Instructions for using Finnish-NLP/Ahma-7B with libraries and local apps such as Transformers, vLLM, SGLang, and Docker Model Runner.

Libraries

How to use Finnish-NLP/Ahma-7B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Finnish-NLP/Ahma-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Finnish-NLP/Ahma-7B")
model = AutoModelForCausalLM.from_pretrained("Finnish-NLP/Ahma-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
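
The slice in the final `print` call is there because, for a causal language model, `generate` returns the prompt tokens followed by the newly generated tokens. Cutting at the prompt length leaves only the model's reply. A minimal sketch of that slicing with plain Python lists follows; the helper name `strip_prompt` and the token IDs are made up for illustration and are not part of Transformers.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that come after the prompt.

    output_ids: the full sequence (prompt followed by generated tokens).
    prompt_length: number of prompt tokens, i.e. inputs["input_ids"].shape[-1]
                   in the snippet above.
    """
    return output_ids[prompt_length:]


# Hypothetical token IDs: the first three belong to the prompt.
prompt_ids = [101, 2040, 2024]
output_ids = prompt_ids + [1045, 2572, 102]

new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)  # [1045, 2572, 102]
```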

Local Apps

vLLM

How to use Finnish-NLP/Ahma-7B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Finnish-NLP/Ahma-7B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Finnish-NLP/Ahma-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
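
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above (same URL, header, and JSON body). The function names `build_chat_request` and `send_chat_request` are illustrative, and the default `base_url` assumes the server is listening on localhost port 8000 as in the curl example.

```python
import json
import urllib.request


def build_chat_request(prompt, model="Finnish-NLP/Ahma-7B",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request("What is the capital of France?")
print(request.full_url)

# Once the server is running, send the request with:
# reply = send_chat_request(request)
```

Building the request separately from sending it makes the payload easy to inspect before a server is available.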

Use Docker

docker model run hf.co/Finnish-NLP/Ahma-7B

SGLang

How to use Finnish-NLP/Ahma-7B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Finnish-NLP/Ahma-7B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Finnish-NLP/Ahma-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
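
The server answers with a JSON document in the OpenAI chat completions format, where the assistant's text sits under `choices[0].message.content`. A small sketch of pulling that text out follows; `extract_reply` is an illustrative helper, and `sample_response` is a made-up response trimmed to the fields used here, not real model output.

```python
def extract_reply(response):
    """Return the assistant text from a chat completions response dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Made-up response, trimmed to the fields this helper reads.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(sample_response))  # Paris.
```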

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Finnish-NLP/Ahma-7B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Finnish-NLP/Ahma-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Finnish-NLP/Ahma-7B with Docker Model Runner:
```
docker model run hf.co/Finnish-NLP/Ahma-7B
```

Ahma-7B

Commit History

Improve model card: Add `transformers` library, link paper, include abstract (#1) 5e71955 verified, RASMUS and nielsr (HF Staff) committed on Jul 3, 2025

Update README.md ee9147f verified, aapot committed on Dec 30, 2024

Update README.md fcb4832 verified, aapot committed on Sep 18, 2024

Upload ahma.jpg aa79db2 verified, aapot committed on Sep 18, 2024

Add 2-stage model ba60e40, aapotanskanen committed on Sep 18, 2024

fix autotokenizer f2e7d96 verified, aapot committed on Jun 4, 2024

Update README.md dfc3629 verified, aapot committed on Jun 3, 2024

Add chat template tokenizer 7904bf0 verified, aapot committed on May 28, 2024

Update optimizers 947b4f4, aapot committed on May 25, 2024

Add 900k step model 5f6eb94, aapot committed on May 7, 2024

Add 800k step model bc2d607, aapot committed on Apr 30, 2024

Add 700k step model 69352a4, aapot committed on Apr 27, 2024

Add 600k step model 760ab5f, aapot committed on Apr 17, 2024

Add 500k step model 32ea3d3, aapot committed on Apr 14, 2024

Add 400k step model d1256a6, aapot committed on Apr 6, 2024

Add 300k step model 1e3094c, aapot committed on Mar 26, 2024

Update README.md f4d1ca9 verified, aapot committed on Mar 19, 2024

Add 200k model a4afd13, aapot committed on Mar 19, 2024

fix daa4cb1, aapot committed on Mar 13, 2024

fix d2f0aae, aapot committed on Mar 13, 2024

Update train script 6c26cc3, aapot committed on Mar 5, 2024

Add 100k step model 245fad2, aapot committed on Feb 19, 2024

Add 50k model 6d1b645, aapot committed on Feb 15, 2024

Add training codes a85f909, aapot committed on Feb 10, 2024

initial commit 20e722f verified, aapot committed on Feb 10, 2024
