MagTina350m-instruct (SFT + DPO + tool-use)

MagTina350m-instruct is the instruction-tuned, tool-using, DPO-aligned version of dataseek/magtina350m-base, a 354.6 M-parameter Brazilian-Portuguese assistant trained from scratch by Dataseek under the Magestic.ai brand.

What's in this checkpoint

Training pipeline applied to the base model, in order:

  1. SFT v4: ~13 K synthetic Q&A pairs (Qwen3-distilled) + 505 hand-authored identity pairs
  2. DPO: 1,312 preference pairs (Qwen3-judged) for politeness / refusal calibration
  3. Tool-use SFT: 3,010 calc + now tool-call examples + 200 hardened NO_TOOL refusals

Final validation loss: 0.79; identity probe: 17/17 verbatim correct; tool-use probe: calc 7/10, now 5/5; NO_TOOL refusal: 5/5.

  • Parameters: 354,591,744
  • Architecture: Llama2-mini → re-exported as LlamaForCausalLM
  • Context: 2,048 tokens
  • Tools: calc (safe-eval arithmetic), now (date/time with PT-BR fields)
  • License: CC-BY-NC 4.0

Chat template

```jinja
{% for message in messages -%}
{% if loop.first %}<|chat_bos|>{% endif -%}
{% if message['role'] == 'system' %}<|system|>{{ message['content'] | trim }}{% endif -%}
{% if message['role'] == 'user' %}<|user|>{{ message['content'] | trim }}{% endif -%}
{% if message['role'] == 'assistant' %}<|assistant|>{{ message['content'] | trim }}<|chat_eos|>{% endif -%}
{% endfor -%}
{% if add_generation_prompt %}<|assistant|>{% endif %}
```

The template is shipped in tokenizer_config.json and used automatically by tokenizer.apply_chat_template(...). Training was single-turn; multi-turn works in practice but was not in the SFT distribution.
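
To see exactly what the template produces, render it without tokenizing. A minimal check (the expected string follows directly from the template above):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("dataseek/magtina350m-instruct")

# tokenize=False returns the rendered prompt string instead of token IDs.
text = tok.apply_chat_template(
    [{"role": "user", "content": "Quem é você?"}],  # "Who are you?"
    add_generation_prompt=True, tokenize=False)
print(text)
# Expected, per the template above:
# <|chat_bos|><|user|>Quem é você?<|assistant|>
```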

Use: pure chat (no tools)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("dataseek/magtina350m-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "dataseek/magtina350m-instruct", torch_dtype=torch.float16).to("cuda")

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "Quem é você?"}],  # "Who are you?"
    add_generation_prompt=True, return_tensors="pt").to("cuda")
out = model.generate(prompt, max_new_tokens=128, do_sample=True,
                     temperature=0.3, top_p=0.9, repetition_penalty=1.05,
                     eos_token_id=5)  # <|chat_eos|>
print(tok.decode(out[0, prompt.shape[1]:], skip_special_tokens=True))
```

Use: with tools

generate() won't intercept tool calls on its own; use the runtime shipped in inference_with_tools.py:

```python
from inference_with_tools import generate_with_tools
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("dataseek/magtina350m-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "dataseek/magtina350m-instruct", torch_dtype=torch.float16).to("cuda")

print(generate_with_tools(model, tok, "Quanto é 25 + 37?",  # "What is 25 + 37?"
                          temperature=0.3, repetition_penalty=1.05))
# → "25 + 37 = 62."
```

Tool protocol

The model emits reserved tokens; the runtime intercepts them and injects the result back:

  • <|tool_call|> (id 9): emitted by the model, followed by a JSON object {"tool": "...", ...}
  • <|tool_result|> (id 10): emitted by the runtime, followed by a JSON object with the result

Tool schemas:

{"tool": "calc", "expr": "25 + 37"}                          โ†’ {"result": 62}
{"tool": "now",  "timezone": "America/Sao_Paulo"}            โ†’ {"datetime": "...", "data": "11/05/2026",
                                                                "hora": "14:32", "weekday": "domingo", ...}

calc runs under an AST-whitelist sandbox: arithmetic operators, the math functions sqrt, abs, round, min, max, sum, log, log2, log10, exp, sin, cos, tan, ceil and floor, and the constants pi and e. Attribute access, other names, comprehensions, function definitions, keyword arguments, and any unlisted callable all raise.
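
For illustration only, a minimal evaluator in the same spirit; the whitelist follows the description above, but this is not the shipped implementation:

```python
import ast
import math
import operator as op

_OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv,
        ast.FloorDiv: op.floordiv, ast.Mod: op.mod, ast.Pow: op.pow,
        ast.USub: op.neg, ast.UAdd: op.pos}
_FUNCS = {name: getattr(math, name) for name in
          ("sqrt", "log", "log2", "log10", "exp", "sin", "cos", "tan", "ceil", "floor")}
_FUNCS.update(abs=abs, round=round, min=min, max=max, sum=sum)
_CONSTS = {"pi": math.pi, "e": math.e}

def safe_calc(expr: str):
    """Evaluate a pure-arithmetic expression; anything off the whitelist raises."""
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        if isinstance(node, ast.Name) and node.id in _CONSTS:
            return _CONSTS[node.id]
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCS and not node.keywords):
            return _FUNCS[node.func.id](*[ev(a) for a in node.args])
        raise ValueError(f"disallowed expression element: {type(node).__name__}")
    return ev(ast.parse(expr, mode="eval"))

print(safe_calc("25 + 37"))       # 62
print(safe_calc("sqrt(2) * pi"))  # ~4.4429
```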

Recommended decoding

| Goal | temperature | top_p | repetition_penalty |
|---|---|---|---|
| Factual / identity / tool-use | 0.3 | 0.9 | 1.05 |
| Creative writing | 0.7–0.9 | 0.9 | 1.10 |

EOS is <|chat_eos|> (id 5). PAD is <pad> (id 3).
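
A quick sanity check of these IDs against the shipped tokenizer (the expected values are the ones quoted on this card):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("dataseek/magtina350m-instruct")
for t in ("<|chat_eos|>", "<pad>", "<|tool_call|>", "<|tool_result|>"):
    print(t, tok.convert_tokens_to_ids(t))
# Expected per this card: <|chat_eos|> 5, <pad> 3, <|tool_call|> 9, <|tool_result|> 10
```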

Evaluation

| Benchmark | Score | Notes |
|---|---|---|
| Identity probe (17 hand-authored Qs) | 17 / 17 | verbatim recall of seeded facts |
| Tool-use: calc probe | 7 / 10 | failures on multi-step word problems |
| Tool-use: now probe | 5 / 5 | weekday + Portuguese date |
| NO_TOOL refusal probe | 5 / 5 | refuses to call a tool when none is needed |
| ENEM-PT (5-option MCQ, 200 q) | 25 % | +5 pp over chance; +5 pp vs base |
| BPB-news vs base | +0.022 | well under the +0.05 ship gate (no major regression) |
| Final SFT-stage val loss | 0.79 | down from 1.59 (SFT v2) → 0.97 (v4) → 0.79 (v5) |

Identity facts baked in

The model knows the following about itself and will recall them verbatim:

  • Created by Dataseek (dataseek.com.br) under the Magestic.ai brand
  • Lead: Ricardo Frasson
  • 354.6 M parameters, Llama2-mini, 20 layers × 16 heads × 1024 hidden dim, vocab 40 K, context 2,048 (see the config check after this list)
  • Pretrained on 17.39 B PT-BR tokens over 15.77 h on 2 × H200 SXM
  • Cost US$ 126.47 / R$ 632.35; energy ~23 kWh, ~5.7 kg CO₂eq
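
The architecture figures above can be cross-checked against the shipped config.json. A minimal sketch, assuming the standard LlamaConfig field names:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("dataseek/magtina350m-instruct")
print(cfg.num_hidden_layers,        # expected 20
      cfg.num_attention_heads,      # expected 16
      cfg.hidden_size,              # expected 1024
      cfg.vocab_size,               # expected ~40 K
      cfg.max_position_embeddings)  # expected 2048
```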

Intended use & limitations

Intended use. Brazilian-Portuguese chat assistant for research, demos, and derivative work. Useful as a small-footprint reasoner with tool-augmented arithmetic and date queries.

Limitations

  • General world knowledge is fragile. The model hallucinates on facts not in its identity corpus.
  • Math without a tool is unreliable; math with the calc tool is correct.
  • Brazilian geography knowledge is only ~50 % reliable; specifics are often invented.
  • Monolingual: PT-BR only. English and Spanish responses degrade fast.
  • Trained single-turn: long multi-turn conversations may lose persona coherence.
  • No safety RLHF beyond DPO: adversarial prompts may elicit unhelpful or biased output. Do not deploy to end users without an additional safety filter.

Files in this repo

| File | Purpose |
|---|---|
| config.json | LlamaConfig |
| model.safetensors | FP16 weights (~709 MB) |
| generation_config.json | Default decoding (temp 0.3, top_p 0.9, rep_pen 1.05) |
| tokenizer.json | v3 BPE, 40 K vocab, 28 special tokens |
| tokenizer_config.json | Chat template, special-token map |
| special_tokens_map.json | BOS/EOS/PAD/UNK + all 28 added tokens |
| inference_with_tools.py | Tool-calling generation loop (calc + now) |
| _conversion_report.json | Top-1 agreement diff vs the original Mag350m |

Citation

```bibtex
@misc{magtina350m2026,
  author = {Frasson, Ricardo and {Dataseek Team}},
  title  = {MagTina350m: A 354 M-parameter Brazilian Portuguese instruction-tuned language model},
  year   = {2026},
  publisher = {Hugging Face},
  url    = {https://huggingface.co/dataseek/magtina350m-instruct}
}
```

License

CC-BY-NC 4.0: free for research and non-commercial derivative work; commercial use requires written permission from Dataseek (contact via dataseek.com.br).
