Skylar-236M-Embed

Skylar is a from-scratch decoder-only LLM (Qwen3-style: RMSNorm · RoPE · GQA · QK-Norm · SwiGLU · µP), pretrained on 1.12B tokens of Italian legal/normative text (EUR-Lex, Banca d'Italia, Gazzetta Ufficiale, normattiva, TED) on a single RTX 4090. Code & full results: https://github.com/2sophia/skylar.

Skylar-236M-Embed is a dense retrieval embedder: the Skylar base run bidirectionally, with mean pooling and L2 normalization on top, contrastively fine-tuned (InfoNCE) on Italian QA. It shares the 236M backbone with the generative model: one base, many products.
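The pooling step and the contrastive objective named above can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's implementation: the function names, the in-batch-negatives setup, and the temperature of 0.05 are assumptions made for the example.

```python
import math

def mean_pool(token_vecs, mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_vecs[0])
    kept = [v for v, m in zip(token_vecs, mask) if m]
    return [sum(v[d] for v in kept) / len(kept) for d in range(dim)]

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def info_nce(queries, docs, temperature=0.05):
    """InfoNCE with in-batch negatives (assumed setup): docs[i] is the positive
    for queries[i], every other doc in the batch is a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, d)) / temperature for d in docs]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]  # cross-entropy against the matching doc
    return total / len(queries)

# Toy batch: two sequences of 2-d token vectors; the last position is padding.
q0 = l2_normalize(mean_pool([[1.0, 0.0], [3.0, 0.0], [9.0, 9.0]], [1, 1, 0]))
q1 = l2_normalize(mean_pool([[0.0, 2.0], [0.0, 4.0], [9.0, 9.0]], [1, 1, 0]))
aligned = info_nce([q0, q1], [q0, q1])    # each query paired with its match
shuffled = info_nce([q0, q1], [q1, q0])   # positives swapped
```

The loss is close to zero when each query sits next to its own positive and grows when the pairs are swapped, which is the signal the fine-tuning relies on.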

Results vs off-the-shelf SOTA (measured, same pool & metrics)

Measured with eval/bench_retrieval.py on the SQuAD-it test set (7609 queries / 1988 contexts). Skylar is fine-tuned on Italian QA, while bge-m3 and e5 are evaluated zero-shot:

Model	Params	R@1	R@5	nDCG@10
Skylar-236M-Embed	236M	0.55	0.81	0.71
`intfloat/multilingual-e5-base`	278M	0.71	0.91	0.83
`BAAI/bge-m3`	568M	0.70	0.90	0.83
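For readers who want to see how the table's columns are defined, here is a minimal sketch of R@k and nDCG@10. It assumes each query has exactly one relevant context; it is not the code in eval/bench_retrieval.py, and the function names and toy scores are illustrative.

```python
import math

def rank_of_gold(scores, gold):
    """1-based rank of the gold context when contexts are sorted by score (descending)."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order.index(gold) + 1

def recall_at_k(ranks, k):
    """Fraction of queries whose gold context appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def ndcg_at_k(ranks, k=10):
    """With one relevant item per query the ideal DCG is 1, so nDCG reduces to
    1 / log2(rank + 1) if the gold item is in the top k, else 0."""
    return sum(1.0 / math.log2(r + 1) if r <= k else 0.0 for r in ranks) / len(ranks)

# Toy example: 3 queries scored against 4 contexts (rows = queries).
sims = [
    [0.9, 0.1, 0.3, 0.2],   # gold = 0 -> rank 1
    [0.2, 0.5, 0.8, 0.1],   # gold = 1 -> rank 2
    [0.4, 0.3, 0.2, 0.1],   # gold = 3 -> rank 4
]
gold = [0, 1, 3]
ranks = [rank_of_gold(row, g) for row, g in zip(sims, gold)]
print(ranks, recall_at_k(ranks, 1), recall_at_k(ranks, 5), ndcg_at_k(ranks, 10))
```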

A from-scratch 236M Italian embedder that reaches ~86% of bge-m3's nDCG@10 with 2.4× fewer parameters, fully local. It does not beat the multilingual SOTA on accuracy; its edge is size, locality, and being part of a one-base generation + retrieval stack. In-domain (banking) it reaches R@1 0.93.
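The two headline ratios follow from the table. A quick check, using only the rounded figures reported there:

```python
# Rounded values copied from the results table.
ndcg = {"Skylar-236M-Embed": 0.71, "BAAI/bge-m3": 0.83}
params_m = {"Skylar-236M-Embed": 236, "BAAI/bge-m3": 568}

ndcg_ratio = ndcg["Skylar-236M-Embed"] / ndcg["BAAI/bge-m3"]
size_ratio = params_m["BAAI/bge-m3"] / params_m["Skylar-236M-Embed"]
print(f"nDCG@10 ratio: {ndcg_ratio:.1%}, parameter ratio: {size_ratio:.1f}x")
```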

Architecture

Field	Value
Params	~236M
Layers	18
d_model	1024
Heads (Q/KV)	16 / 4 (GQA)
d_ff	2816 (SwiGLU)
Context	2048
Vocab	32,768 (ByteLevel BPE)
Pos. enc.	RoPE (θ=1e6)
Normalization	RMSNorm · QK-Norm
License	Apache-2.0
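As a sanity check, the ~236M figure can be approximately reconstructed from the other rows of the table. The sketch below makes several assumptions that the card does not state: head_dim = d_model / query heads (64), input and output embeddings tied, no bias terms, two RMSNorms per layer plus per-head QK-Norm gains, and one final RMSNorm. The function name is hypothetical.

```python
def count_params(vocab, d_model, n_layers, n_q_heads, n_kv_heads, d_ff):
    head_dim = d_model // n_q_heads            # assumption: 1024 / 16 = 64
    embed = vocab * d_model                    # assumption: tied with the output head
    attn = (
        d_model * n_q_heads * head_dim         # W_q
        + 2 * d_model * n_kv_heads * head_dim  # W_k and W_v (GQA: fewer KV heads)
        + n_q_heads * head_dim * d_model       # W_o
    )
    mlp = 3 * d_model * d_ff                   # SwiGLU: gate, up and down projections
    norms = 2 * d_model + 2 * head_dim         # two RMSNorms + QK-Norm gains per layer
    final_norm = d_model
    return embed + n_layers * (attn + mlp + norms) + final_norm

total = count_params(vocab=32_768, d_model=1024, n_layers=18,
                     n_q_heads=16, n_kv_heads=4, d_ff=2816)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the total comes to 236,494,080, which is consistent with the ~236M in the table. With 4 KV heads instead of 16, the key and value projections are a quarter of the size they would have under standard multi-head attention.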

Usage

# pip install git+https://github.com/2sophia/skylar.git
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer
from models.embedder import SkylarEmbedder
import torch

REPO = "Sophia-AI/Skylar-236M-Embed"
model = SkylarEmbedder.from_pretrained(REPO).eval()
tok = Tokenizer.from_file(hf_hub_download(REPO, "tokenizer.json"))

def embed(texts, max_len=256):
    # Tokenize without special tokens and truncate each text to max_len tokens.
    ids = [tok.encode(t, add_special_tokens=False).ids[:max_len] for t in texts]
    longest = max(len(x) for x in ids)
    # Right-pad every sequence with 0; the attention mask marks the real tokens.
    x = torch.zeros(len(ids), longest, dtype=torch.long)
    a = torch.zeros(len(ids), longest, dtype=torch.long)
    for i, t in enumerate(ids):
        x[i, :len(t)] = torch.tensor(t)
        a[i, :len(t)] = 1
    with torch.no_grad():
        return model(x, attention_mask=a)["embeddings"]   # L2-normalized

q = embed(["Cos'è un bonifico SEPA?"])
docs = embed(["Il bonifico SEPA trasferisce euro nell'area unica dei pagamenti.",
              "Il mutuo è un finanziamento per l'acquisto di un immobile."])
print((q @ docs.T).tolist())   # cosine similarity → first doc wins
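
To turn a similarity matrix into a ranked result list, each row (one per query) can be sorted by score. Below is a minimal sketch in plain Python that operates on one row of `(q @ docs.T).tolist()`. The helper `top_k` and the scores are hypothetical; for large corpora, `torch.topk` or an approximate nearest-neighbour index would be the more usual choice.

```python
def top_k(similarities, k=3):
    """Return (doc_index, score) pairs for the k highest scores, best first."""
    ranked = sorted(enumerate(similarities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical cosine scores for one query against four documents.
row = [0.12, 0.81, 0.47, 0.80]
best = top_k(row, k=2)
print(best)
```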

Weights: Safetensors · 0.2B params · F32

Base model: Sophia-AI/Skylar-236M-Base (this model is one of its 2 fine-tunes)
