gpt-sw3-126m-instruct
Smallest GPT-SW3 instruct model (126M parameters). It loads quickly, which makes it well suited to testing and prototyping.
Size: 126M | Type: instruct | Languages: Swedish, Norwegian, Danish, Icelandic, English
Community mirror of AI-Sweden-Models/gpt-sw3-126m-instruct
Warning and Disclaimer
This model is provided as-is for research and educational purposes. It is a community redistribution of AI Sweden's GPT-SW3, released under the same modified RAIL license.
You are responsible for any content you create using this model. Use responsibly.
The model may reflect biases from its training data and may generate inaccurate, offensive, or inappropriate content. Neither the uploader nor AI Sweden is liable for downstream misuse. Review the AI Sweden RAIL license before any production deployment.
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "WestCode1357/gpt-sw3-126m-instruct"
# Pick the best available device: Apple MPS, then CUDA, then CPU.
device = "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu"
# Use float16 only on an accelerator; half precision can be slow or unsupported on CPU with some PyTorch builds.
dtype = torch.float32 if device == "cpu" else torch.float16
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
model.to(device)
# Plain text continuation (no chat formatting).
prompt = "Träd är fina för att"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0]))
Chat / instruct format
GPT-SW3 instruct marks conversation turns with special tokens: <|endoftext|> opens the conversation and <s> precedes each turn. The format is:
<|endoftext|><s>User: [your message]<s>Bot: [response]<s>...
eos = "<|endoftext|>"
seg = "<s>"
prompt = f"{eos}{seg}User: Vad är huvudstaden i Sverige?{seg}Bot: "
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=False))
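For multi-turn conversations, the prompt string can be assembled from a list of turns instead of being written by hand. The following is a minimal sketch that assumes the single-line format shown above; `build_prompt` and `extract_reply` are hypothetical helper names, not part of Transformers or of the GPT-SW3 release. Because the reply is decoded with special tokens kept, the sketch also trims the generated text at the next `<s>` or `<|endoftext|>` so that only the first Bot turn is returned.

```python
EOS = "<|endoftext|>"  # conversation start marker, as in the format above
SEP = "<s>"            # turn separator, as in the format above


def build_prompt(turns):
    """Render (role, message) pairs and leave an open Bot turn for generation.

    Hypothetical helper: roles are expected to be "User" or "Bot".
    """
    parts = [EOS]
    for role, message in turns:
        parts.append(f"{SEP}{role}: {message.strip()}")
    parts.append(f"{SEP}Bot: ")
    return "".join(parts)


def extract_reply(generated):
    """Keep only the text before the next separator or end-of-text marker."""
    cut = len(generated)
    for marker in (SEP, EOS):
        idx = generated.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return generated[:cut].strip()


history = [
    ("User", "Vad är huvudstaden i Sverige?"),
    ("Bot", "Stockholm."),
    ("User", "Och i Norge?"),
]
prompt = build_prompt(history)
print(prompt)
```

The resulting `prompt` can be passed to the tokenizer exactly as in the snippet above, and `extract_reply` can be applied to the decoded continuation before it is shown or appended to `history` as the next `("Bot", ...)` turn.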
Intended Use
⚠️ These models contain extreme bias and are NOT intended for commercial use. For scientific and research use only.
GPT-SW3 was trained on large-scale web data and may reflect harmful societal biases present in that data. It has not been aligned or safety-tuned beyond its original training. Use strictly in controlled research settings. Do not deploy in any consumer-facing or commercial product without thorough evaluation and additional safety measures.
About GPT-SW3
GPT-SW3 is developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. It was trained on 320B tokens of Swedish, Norwegian, Danish, Icelandic, and English text, plus programming code.
- Original models: https://huggingface.co/AI-Sweden-Models
- Project page: https://www.ai.se/en/project/gpt-sw3