Text Generation
Transformers
Safetensors
Finnish
llama
finnish
conversational
text-generation-inference
Instructions for using Finnish-NLP/Ahma-7B with libraries, notebooks, and local apps.
- Libraries
- Transformers
How to use Finnish-NLP/Ahma-7B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="Finnish-NLP/Ahma-7B")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("Finnish-NLP/Ahma-7B")
    model = AutoModelForCausalLM.from_pretrained("Finnish-NLP/Ahma-7B")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

- Notebooks
- Google Colab
- Kaggle
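The last line of the direct-loading Transformers snippet decodes only `outputs[0][inputs["input_ids"].shape[-1]:]`. For a decoder-only model, `generate` returns the prompt tokens followed by the continuation, so the slice drops the prompt. The sketch below shows the same idea with plain Python lists. The token IDs and the `strip_prompt` helper are illustrative assumptions, not values or functions from the model or the library.

```python
# Illustrative token IDs only; real IDs come from the tokenizer.
prompt_ids = [1, 450, 7483, 310]        # stands in for inputs["input_ids"][0]
generated_ids = prompt_ids + [3681, 2]  # stands in for outputs[0]: prompt + continuation


def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that follow the prompt (hypothetical helper)."""
    return output_ids[prompt_length:]


new_tokens = strip_prompt(generated_ids, len(prompt_ids))
print(new_tokens)  # [3681, 2]
```

Without the slice, the decoded string would begin with the chat-formatted prompt rather than the model's answer.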
- Local Apps
- vLLM
How to use Finnish-NLP/Ahma-7B with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "Finnish-NLP/Ahma-7B"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Finnish-NLP/Ahma-7B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

Use Docker
docker model run hf.co/Finnish-NLP/Ahma-7B
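The curl call to the vLLM server can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the server started by `vllm serve` is listening on `localhost:8000` as in the curl example, and the helper names `build_chat_request` and `send` are hypothetical, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the parsed JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000",  # assumed default address from the curl example
    "Finnish-NLP/Ahma-7B",
    "What is the capital of France?",
)
# response = send(request)  # uncomment once the server is running
```

Building the request separately from sending it lets the payload be inspected or logged before any network traffic happens.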
- SGLang
How to use Finnish-NLP/Ahma-7B with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "Finnish-NLP/Ahma-7B" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Finnish-NLP/Ahma-7B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "Finnish-NLP/Ahma-7B" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Finnish-NLP/Ahma-7B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

- Docker Model Runner
How to use Finnish-NLP/Ahma-7B with Docker Model Runner:
docker model run hf.co/Finnish-NLP/Ahma-7B
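The vLLM and SGLang commands above both expose an OpenAI-compatible chat endpoint, so a reply can be read from the response in the same way for either server. The sketch below assumes the standard OpenAI chat completion layout, where the text sits at `choices[0].message.content`. The `extract_reply` helper and the `sample_response` dictionary are hypothetical. The sample is a hand-written stand-in, not real model output.

```python
# Local endpoints taken from the curl commands above.
ENDPOINTS = {
    "vllm": "http://localhost:8000/v1/chat/completions",
    "sglang": "http://localhost:30000/v1/chat/completions",
}


def extract_reply(response):
    """Return the assistant text from an OpenAI-style chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written stand-in for a parsed JSON response; not real model output.
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "<reply text>"}}
    ]
}

print(extract_reply(sample_response))  # <reply text>
```

Raising on an empty `choices` list makes a malformed or error response fail loudly instead of surfacing later as a `KeyError` or `IndexError`.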
Commit History
- Update README.md (ee9147f, verified)
- Update README.md (fcb4832, verified)
- Upload ahma.jpg (aa79db2, verified)
- Add 2-stage model (ba60e40), committed by aapotanskanen
- fix autotokenizer (f2e7d96, verified)
- Update README.md (dfc3629, verified)
- Add chat template tokenizer (7904bf0, verified)
- Update optimizers (947b4f4), committed by aapot
- Add 900k step model (5f6eb94), committed by aapot
- Add 800k step model (bc2d607), committed by aapot
- Add 700k step model (69352a4), committed by aapot
- Add 600k step model (760ab5f), committed by aapot
- Add 500k step model (32ea3d3), committed by aapot
- Add 400k step model (d1256a6), committed by aapot
- Add 300k step model (1e3094c), committed by aapot
- Update README.md (f4d1ca9, verified)
- Add 200k model (a4afd13), committed by aapot
- fix (daa4cb1), committed by aapot
- fix (d2f0aae), committed by aapot
- Update train script (6c26cc3), committed by aapot
- Add 100k step model (245fad2), committed by aapot
- Add 50k model (6d1b645), committed by aapot
- Add training codes (a85f909), committed by aapot