Skylar-236M-Chat

Skylar is a from-scratch, decoder-only LLM (Qwen3-style: RMSNorm · RoPE · GQA · QK-Norm · SwiGLU · µP), pretrained on 1.12B tokens of Italian legal and regulatory text (EUR-Lex, Banca d'Italia, Gazzetta Ufficiale, Normattiva, TED) on a single RTX 4090. Code and full results: https://github.com/2sophia/skylar.

Instruction-tuned (ChatML, assistant-only loss) on top of Skylar-236M-Base for grounded Italian RAG: answer-from-context, classify, extract-to-JSON, query generation, refuse-when-absent.

Honest scope: read before using. This is a 236M-parameter domain model, intentionally small and undertrained by Chinchilla standards (~1.12B tokens, ≈ 5× below the compute-optimal token budget for this size). It is an Italian specialist (English is not fluent) and it is not a factual oracle: it hallucinates open-domain facts. Its real, measured strength is grounded Italian tasks (answer, extract, or classify from provided context). It is published as a transparent reference for the Skylar framework, not as a general-purpose or SOTA model.
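As a rough check on the undertraining figure, the sketch below compares the 1.12B training tokens with a compute-optimal token budget. The tokens-per-parameter ratio is an assumption (about 20 is the commonly quoted Chinchilla rule of thumb), and the function name is made up for illustration.

```python
def undertraining_factor(n_params, n_tokens, tokens_per_param=20.0):
    """Return how many times smaller the token budget is than the assumed optimum."""
    optimal_tokens = n_params * tokens_per_param  # assumed compute-optimal budget
    return optimal_tokens / n_tokens


N_PARAMS = 236e6   # model size from this card
N_TOKENS = 1.12e9  # pretraining tokens from this card

factor_20 = undertraining_factor(N_PARAMS, N_TOKENS)
factor_25 = undertraining_factor(N_PARAMS, N_TOKENS, tokens_per_param=25.0)

print(f"{factor_20:.1f}x")  # 4.2x
print(f"{factor_25:.1f}x")  # 5.3x
```

At 20 tokens per parameter the gap is about 4.2×, and at 25 it is about 5.3×, so the ≈ 5× quoted here sits inside the range that common rules of thumb imply.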

What it does well (measured)

eval/validate_grounded.py — 6/6 with clean stopping:

| Task | Output |
|---|---|
| Answer-from-context | ✅ correct |
| Classify → one word | ✅ `credito` |
| Extract → JSON | ✅ `{"importo": "1.500 euro", "scadenza": "31 dicembre 2024"}` |
| Query generation (RAG) | ✅ 3 clean search queries |
| Refuse when not in context | ✅ "Non presente nel contesto" |
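
Downstream code usually needs to tell a refusal from an answer and to confirm that an extraction is valid JSON. The sketch below shows one way to do that on the outputs listed above. It is not the logic of eval/validate_grounded.py: the helper names, the required-key list, and the refusal-matching rule are assumptions for illustration.

```python
import json

# Refusal phrase as it appears in the results table.
REFUSAL = "Non presente nel contesto"


def is_refusal(text):
    """True if the output is the refusal phrase (case and trailing period ignored)."""
    return text.strip().rstrip(".").lower() == REFUSAL.lower()


def parse_extraction(text, required_keys):
    """Return the parsed JSON object, or None if it is invalid or missing keys."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    if any(key not in data for key in required_keys):
        return None
    return data


sample = '{"importo": "1.500 euro", "scadenza": "31 dicembre 2024"}'
record = parse_extraction(sample, ["importo", "scadenza"])
print(record["importo"], "|", record["scadenza"])
```

Returning None instead of raising keeps the caller's control flow simple: a failed parse can be retried or routed to the refusal path.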

Limits: fluent but hallucinates open-domain facts (use it grounded, with retrieved context); English is non-functional (Italian-only).

Architecture

| Field | Value |
|---|---|
| Params | ~236M |
| Layers | 18 |
| d_model | 1024 |
| Heads (Q/KV) | 16 / 4 (GQA) |
| d_ff | 2816 (SwiGLU) |
| Context | 2048 |
| Vocab | 32,768 (ByteLevel BPE) |
| Pos. enc. / norm | RoPE (θ=1e6) · QK-Norm · RMSNorm |
| License | Apache-2.0 |
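
The ~236M figure can be roughly reproduced from the table. The sketch below is a back-of-the-envelope count, not the repository's code, and it rests on assumptions the card does not state: head_dim = d_model / query heads, no bias terms, two RMSNorm weight vectors per layer plus a final one, QK-Norm weights of size head_dim, and tied input/output embeddings. The function name is hypothetical.

```python
def count_params(n_layers, d_model, n_q_heads, n_kv_heads, d_ff, vocab, tied=True):
    """Approximate parameter count for a GQA + SwiGLU decoder (assumed layout)."""
    head_dim = d_model // n_q_heads
    kv_dim = n_kv_heads * head_dim                        # GQA: narrower K/V projections
    attn = 2 * d_model * d_model + 2 * d_model * kv_dim   # Q and O, plus K and V
    mlp = 3 * d_model * d_ff                              # SwiGLU: gate, up, down
    norms = 2 * d_model + 2 * head_dim                    # two RMSNorms + QK-Norm (assumed)
    embed = vocab * d_model
    total = n_layers * (attn + mlp + norms) + d_model + embed  # + final RMSNorm
    if not tied:
        total += embed                                    # separate output projection
    return total


total = count_params(18, 1024, 16, 4, 2816, 32768)
untied = count_params(18, 1024, 16, 4, 2816, 32768, tied=False)
print(f"{total:,}")   # 236,494,080
print(f"{untied:,}")  # 270,048,512
```

Under these assumptions the tied count lands at about 236.5M, in line with the table, while untied embeddings would give about 270M. That suggests tied embeddings, but the repository is the authority on the actual layout.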

Usage — ChatML

# pip install git+https://github.com/2sophia/skylar.git
import torch
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer
from models.decoder import NanoTransformer

# Load the chat checkpoint and its tokenizer from the Hugging Face Hub.
model = NanoTransformer.from_pretrained("Sophia-AI/Skylar-236M-Chat").eval()
tok = Tokenizer.from_file(hf_hub_download("Sophia-AI/Skylar-236M-Chat", "tokenizer.json"))

# ChatML prompt: system rule, user turn (context + question), then an open assistant turn.
prompt = (
    "<|im_start|>system\nSei un assistente che risponde SOLO dal contesto.<|im_end|>\n"
    "<|im_start|>user\nContesto: La Banca d'Italia ha sede a Roma.\n\nDomanda: Dove ha sede?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# The ChatML markers are already in the prompt, so no extra special tokens are added.
ids = torch.tensor([tok.encode(prompt, add_special_tokens=False).ids])
# Low temperature for grounded answers; generation stops at the end-of-turn token.
out = model.generate(ids, max_new_tokens=80, temperature=0.3,
                     eos_token_id=[tok.token_to_id("<|im_end|>")])
print(tok.decode(out[0].tolist()))
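
For repeated calls it can help to build the ChatML string from its parts instead of writing it by hand. The sketch below reproduces the prompt layout used above. The helper names are hypothetical and not part of the Skylar repository, and the "Contesto: ... Domanda: ..." layout is simply copied from the example rather than confirmed as the only format the model was tuned on.

```python
def build_chatml(system, user):
    """Format one system turn and one user turn, then open the assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def grounded_user_turn(context, question):
    """Combine retrieved context and the question in the layout shown above."""
    return f"Contesto: {context}\n\nDomanda: {question}"


prompt = build_chatml(
    "Sei un assistente che risponde SOLO dal contesto.",
    grounded_user_turn("La Banca d'Italia ha sede a Roma.", "Dove ha sede?"),
)
print(prompt)
```

The resulting string can be passed to tok.encode(prompt, add_special_tokens=False) exactly as in the usage example.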
