gpt-sw3-1.3b-instruct

Balanced GPT-SW3 instruct model (1.3B parameters). Fast inference with good Swedish chat quality.

Size: 1.3B | Type: instruct | Languages: Swedish, Norwegian, Danish, Icelandic, English

Community mirror of AI-Sweden-Models/gpt-sw3-1.3b-instruct


Warning and Disclaimer

This model is provided as-is for research and educational purposes. It is a community redistribution of AI Sweden's GPT-SW3, released under the same modified RAIL license.

You are responsible for any content you create using this model. Use responsibly.

The model may reflect biases from its training data and may generate inaccurate, offensive, or inappropriate content. Neither the uploader nor AI Sweden is liable for downstream misuse. Review the AI Sweden RAIL license before any production deployment.



Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

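model_id = "WestCode1357/gpt-sw3-1.3b-instruct"
# Pick the best available device: Apple MPS, then CUDA, then CPU.
device = "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu"
# Use half precision on accelerators; fall back to float32 on CPU, where float16 ops can be slow or unsupported.
dtype = torch.float32 if device == "cpu" else torch.float16
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
model.to(device)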

prompt = "Träd är fina för att"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0]))

Chat / instruct format

GPT-SW3 instruct uses special tokens. The format is:

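<|endoftext|><s>User: [your message]<s>Bot: [response]<s>...

# Reuses tokenizer, model, and device from the Usage example above.
eos = "<|endoftext|>"  # marks the start of the conversation
seg = "<s>"            # separates turns
prompt = f"{eos}{seg}User: Vad är huvudstaden i Sverige?{seg}Bot: "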
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(
    **inputs, max_new_tokens=200,
    do_sample=True, temperature=0.7, top_p=0.95,
    eos_token_id=tokenizer.eos_token_id
)
# Decode only the newly generated tokens, keeping special tokens visible.
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=False))
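
The turn format above extends naturally to multi-turn conversations. As a minimal sketch, the hypothetical helpers below (`build_prompt` and `trim_reply` are illustrative names, not part of the GPT-SW3 or transformers API) assemble a prompt from a list of turns and cut a decoded reply at the next turn separator. They assume the format is exactly as shown in this section.

```python
EOS = "<|endoftext|>"  # conversation start marker, as in the format above
SEG = "<s>"            # turn separator, as in the format above


def build_prompt(turns):
    """Build a prompt from (role, text) pairs, ending with an open Bot turn.

    Hypothetical helper: roles are assumed to be "User" or "Bot".
    """
    parts = [EOS]
    for role, text in turns:
        parts.append(f"{SEG}{role}: {text}")
    # Leave the final Bot turn open so the model writes the reply.
    parts.append(f"{SEG}Bot: ")
    return "".join(parts)


def trim_reply(decoded):
    """Keep only the text before the next turn separator or end marker."""
    reply = decoded.split(SEG)[0].split(EOS)[0]
    return reply.strip()


history = [
    ("User", "Vad är huvudstaden i Sverige?"),
    ("Bot", "Stockholm."),
    ("User", "Och i Norge?"),
]
prompt = build_prompt(history)
```

The resulting `prompt` can be passed to the tokenizer and `model.generate` in the same way as the single-turn example, and `trim_reply` can be applied to the decoded output. Trimming helps because decoding with `skip_special_tokens=False` leaves any trailing `<s>` or `<|endoftext|>` in the text.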

Intended Use

⚠️ These models contain extreme bias and are NOT intended for commercial use. For scientific and research use only.

GPT-SW3 was trained on large-scale web data and may reflect harmful societal biases present in that data. It has not been aligned or safety-tuned beyond its original training. Use strictly in controlled research settings. Do not deploy in any consumer-facing or commercial product without thorough evaluation and additional safety measures.

About GPT-SW3

GPT-SW3 is developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. The models were trained on 320B tokens of Swedish, Norwegian, Danish, Icelandic, English, and programming code.
