# MagTina350m – instruct (SFT + DPO + tool-use)
MagTina350m-instruct is the instruction-tuned, tool-using, DPO-aligned version of
`dataseek/magtina350m-base`, a 354.6 M-parameter Brazilian-Portuguese assistant trained
from scratch by Dataseek under the Magestic.ai brand.
## What's in this checkpoint
Training pipeline applied to the base model, in order:
- SFT v4 – ~13 K synthetic Q&A pairs (Qwen3-distilled) + 505 hand-authored identity pairs
- DPO – 1 312 preference pairs (Qwen3-judged) for politeness / refusal calibration
- Tool-use SFT – 3 010 `calc` + `now` tool-call examples + 200 hardened NO_TOOL refusals
Final validation loss 0.79; identity probe 17/17 verbatim correct; tool-use probe: calc 7/10, now 5/5; NO_TOOL refusal 5/5.
| Spec | Value |
|---|---|
| Parameters | 354,591,744 |
| Architecture | Llama2-mini, re-exported as `LlamaForCausalLM` |
| Context | 2 048 tokens |
| Tools | `calc` (safe-eval arithmetic), `now` (date/time with PT-BR fields) |
| License | CC-BY-NC 4.0 |
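
To verify the exported architecture, the standard `transformers` config fields can be read straight from the checkpoint; a minimal sketch (the values in the comment are the ones quoted on this card, see "Identity facts baked in" below):

```python
# Inspect the exported LlamaConfig; field names are the stock transformers attributes.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("dataseek/magtina350m-instruct")
print(cfg.num_hidden_layers, cfg.num_attention_heads, cfg.hidden_size,
      cfg.vocab_size, cfg.max_position_embeddings)
# Expected per this card: 20 layers, 16 heads, 1024 hidden, ~40 K vocab, 2 048 context
```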
## Chat template
```jinja
{% for message in messages -%}
{% if loop.first %}<|chat_bos|>{% endif -%}
{% if message['role'] == 'system' %}<|system|>{{ message['content'] | trim }}{% endif -%}
{% if message['role'] == 'user' %}<|user|>{{ message['content'] | trim }}{% endif -%}
{% if message['role'] == 'assistant' %}<|assistant|>{{ message['content'] | trim }}<|chat_eos|>{% endif -%}
{% endfor -%}
{% if add_generation_prompt %}<|assistant|>{% endif %}
```
The template is shipped in `tokenizer_config.json` and used automatically by
`tokenizer.apply_chat_template(...)`. The model was trained single-turn; multi-turn works
in practice but was not in the SFT distribution.
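
To see exactly what the template produces, it can be rendered as a string without tokenizing; a quick sketch (the messages are illustrative, and the expected output follows directly from the template above):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("dataseek/magtina350m-instruct")
text = tok.apply_chat_template(
    [{"role": "system", "content": "Você é a MagTina."},   # "You are MagTina."
     {"role": "user", "content": "Olá!"}],                  # "Hello!"
    tokenize=False, add_generation_prompt=True)
print(text)
# Expected shape per the template:
# <|chat_bos|><|system|>Você é a MagTina.<|user|>Olá!<|assistant|>
```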
## Use – pure chat (no tools)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("dataseek/magtina350m-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "dataseek/magtina350m-instruct", torch_dtype=torch.float16).to("cuda")

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "Quem é você?"}],  # "Who are you?"
    add_generation_prompt=True, return_tensors="pt").to("cuda")

out = model.generate(prompt, max_new_tokens=128, do_sample=True,
                     temperature=0.3, top_p=0.9, repetition_penalty=1.05,
                     eos_token_id=5)  # <|chat_eos|>
print(tok.decode(out[0, prompt.shape[1]:], skip_special_tokens=True))
```
## Use – with tools
`generate()` won't intercept tool calls on its own; use the runtime shipped in
`inference_with_tools.py`:
```python
from inference_with_tools import generate_with_tools
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("dataseek/magtina350m-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "dataseek/magtina350m-instruct", torch_dtype=torch.float16).to("cuda")

print(generate_with_tools(model, tok, "Quanto é 25 + 37?",  # "How much is 25 + 37?"
                          temperature=0.3, repetition_penalty=1.05))
# → "25 + 37 = 62."
```
## Tool protocol
The model emits reserved tokens; the runtime intercepts them and injects the result back:
| Token | ID | Role |
|---|---|---|
| `<\|tool_call\|>` | 9 | Model emits this, then a JSON object `{"tool": "...", ...}` |
| `<\|tool_result\|>` | 10 | Runtime emits this, then a JSON object with the result |
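
Putting the two tokens together, a runtime loop along these lines does the interception. This is a hedged sketch only: the shipped logic lives in `inference_with_tools.py` and may differ in detail, it assumes the model stops generating after the tool-call JSON, and `run_tool()` is a hypothetical dispatcher, not part of the published API:

```python
import json
import torch

TOOL_CALL_ID, TOOL_RESULT_ID, EOS_ID = 9, 10, 5  # <|tool_call|>, <|tool_result|>, <|chat_eos|>

def run_tool(payload: dict) -> dict:
    """Hypothetical dispatcher: route the parsed call to calc / now, return its JSON result."""
    raise NotImplementedError

@torch.no_grad()
def chat_with_tools(model, tok, prompt_ids, max_rounds=3):
    ids = prompt_ids
    out = ids
    for _ in range(max_rounds):
        out = model.generate(ids, max_new_tokens=128, eos_token_id=EOS_ID)
        new_tokens = out[0, ids.shape[1]:].tolist()
        if TOOL_CALL_ID not in new_tokens:
            break  # plain answer, no tool requested
        # Everything after <|tool_call|> is taken as the JSON payload of the call.
        payload = json.loads(tok.decode(new_tokens[new_tokens.index(TOOL_CALL_ID) + 1:],
                                        skip_special_tokens=True))
        result = run_tool(payload)
        # Inject <|tool_result|> + JSON and let the model keep generating.
        feedback = tok.encode("<|tool_result|>" + json.dumps(result, ensure_ascii=False),
                              add_special_tokens=False, return_tensors="pt").to(out.device)
        ids = torch.cat([out, feedback], dim=1)
    return tok.decode(out[0, prompt_ids.shape[1]:], skip_special_tokens=True)
```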
Tool schemas:

```
{"tool": "calc", "expr": "25 + 37"}               → {"result": 62}
{"tool": "now", "timezone": "America/Sao_Paulo"}  → {"datetime": "...", "data": "11/05/2026",
                                                     "hora": "14:32", "weekday": "domingo", ...}
```
`calc` runs under an AST-whitelist sandbox: arithmetic operators, the math functions sqrt, abs, round, min, max, sum, log, log2, log10, exp, sin, cos, tan, ceil, floor, and the constants pi and e. Attribute access, other names, comprehensions, function definitions, keyword arguments, and any unlisted callable all raise.
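
A minimal sketch of an evaluator in that spirit, assuming nothing about the actual code in `inference_with_tools.py` beyond the whitelist described above:

```python
import ast
import math
import operator as op

_FUNCS = {"sqrt": math.sqrt, "abs": abs, "round": round, "min": min, "max": max,
          "sum": sum, "log": math.log, "log2": math.log2, "log10": math.log10,
          "exp": math.exp, "sin": math.sin, "cos": math.cos, "tan": math.tan,
          "ceil": math.ceil, "floor": math.floor}
_CONSTS = {"pi": math.pi, "e": math.e}
_BINOPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv,
           ast.FloorDiv: op.floordiv, ast.Mod: op.mod, ast.Pow: op.pow}
_UNARY = {ast.UAdd: op.pos, ast.USub: op.neg}

def safe_calc(expr: str):
    """Evaluate an arithmetic expression; anything outside the whitelist raises."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in _CONSTS:
            return _CONSTS[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCS and not node.keywords):
            return _FUNCS[node.func.id](*[_eval(a) for a in node.args])
        raise ValueError(f"disallowed expression element: {ast.dump(node)}")
    return _eval(ast.parse(expr, mode="eval"))

print(safe_calc("25 + 37"))  # 62
```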
## Recommended decoding
| Goal | temperature | top_p | repetition_penalty |
|---|---|---|---|
| Factual / identity / tool-use | 0.3 | 0.9 | 1.05 |
| Creative writing | 0.7–0.9 | 0.9 | 1.10 |
EOS is `<|chat_eos|>` (id 5). PAD is `<pad>` (id 3).
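
For creative writing, the defaults shipped in `generation_config.json` can be overridden at call time; a minimal sketch reusing `model`, `tok`, and `prompt` from the pure-chat example above:

```python
# Override the shipped defaults (temp 0.3, top_p 0.9, rep_pen 1.05)
# with the creative-writing settings from the table above.
out = model.generate(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,          # creative range 0.7-0.9
    top_p=0.9,
    repetition_penalty=1.10,
    eos_token_id=5,           # <|chat_eos|>
    pad_token_id=3,           # <pad>
)
print(tok.decode(out[0, prompt.shape[1]:], skip_special_tokens=True))
```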
## Evaluation
| Benchmark | Score | Notes |
|---|---|---|
| Identity probe (17 hand-authored Qs) | 17 / 17 | verbatim recall of seeded facts |
| Tool-use – calc probe | 7 / 10 | failures on multi-step word problems |
| Tool-use – now probe | 5 / 5 | weekday + Portuguese date |
| NO_TOOL refusal probe | 5 / 5 | refuses tool-call when no tool needed |
| ENEM-PT (5-option MCQ, 200 q) | 25 % | +5 pp over chance; +5 pp vs base |
| BPB-news vs base | +0.022 | well under +0.05 ship gate (no big regression) |
| Final SFT-stage val loss | 0.79 | down from 1.59 (SFT v2) → 0.97 (v4) → 0.79 (v5) |
## Identity facts baked in
The model knows the following about itself and will recall them verbatim:
- Created by Dataseek (dataseek.com.br) under the Magestic.ai brand
- Lead: Ricardo Frasson
- 354.6 M parameters, Llama2-mini, 20 L × 16 H × 1024 d, vocab 40 K, context 2 048
- Pretrained on 17.39 B PT-BR tokens over 15.77 h on 2 × H200 SXM
- Cost US$ 126.47 / R$ 632.35; energy ~23 kWh, ~5.7 kg CO₂eq
## Intended use & limitations
**Intended use.** Brazilian-Portuguese chat assistant for research, demos, and derivative work. Useful as a small-footprint reasoner with tool-augmented arithmetic and date queries.

**Limitations**
- General world knowledge is fragile. The model hallucinates on facts not in its identity corpus.
- Math without a tool is unreliable; math with the `calc` tool is correct.
- Brazilian geography knowledge is ~50 % reliable; specifics are often invented.
- Monolingual: PT-BR only. English/Spanish responses degrade fast.
- Trained single-turn: long multi-turn conversations may lose persona coherence.
- No safety RLHF beyond DPO: adversarial prompts may elicit unhelpful or biased output. Don't deploy to end-users without an additional safety filter.
## Files in this repo
| File | Purpose |
|---|---|
| `config.json` | `LlamaConfig` |
| `model.safetensors` | FP16 weights (~709 MB) |
| `generation_config.json` | Default decoding (temp 0.3, top_p 0.9, rep_pen 1.05) |
| `tokenizer.json` | v3 BPE, 40 K vocab, 28 special tokens |
| `tokenizer_config.json` | Chat template, special-token map |
| `special_tokens_map.json` | BOS/EOS/PAD/UNK + all 28 added tokens |
| `inference_with_tools.py` | Tool-calling generation loop (`calc` + `now`) |
| `_conversion_report.json` | Top-1 agreement diff vs original Mag350m |
## Citation
```bibtex
@misc{magtina350m2026,
  author    = {Frasson, Ricardo and {Dataseek Team}},
  title     = {MagTina350m: A 354 M-parameter Brazilian Portuguese instruction-tuned language model},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/dataseek/magtina350m-instruct}
}
```
## License
CC-BY-NC 4.0: free for research and non-commercial derivative work; commercial use requires written permission from Dataseek (contact via dataseek.com.br).