PINDARO HF (General)

PINDARO HF is the Hugging Face-format release of the general-purpose Pindaro model.

Model At A Glance

  • Architecture: LlamaForCausalLM
  • Model type: llama
  • Approx. parameters: ~1.1B
  • Precision: float16
  • Context length: 2048
  • Vocabulary size: 32002
  • Languages: Italian, English
  • Primary use: general assistant text generation

Included Files (HF)

  • model.safetensors
  • config.json
  • generation_config.json
  • tokenizer.json
  • tokenizer.model
  • tokenizer_config.json
  • special_tokens_map.json
  • added_tokens.json

This repository is HF-only. GGUF artifacts are intentionally not included here.
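When mirroring or vendoring the repository, a quick presence check against the file list above can catch incomplete downloads. A minimal sketch; `repo_dir` is whatever local path you cloned or snapshot-downloaded into:

```python
from pathlib import Path

# File list copied from the "Included Files (HF)" section above.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "generation_config.json",
    "tokenizer.json",
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "added_tokens.json",
]

def missing_files(repo_dir):
    """Return the expected artifacts that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).exists()]
```

An empty return value means all eight artifacts are present; it does not verify their integrity (see the checksum section below for that).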

Prompt Format

The tokenizer uses Noesis-style control tokens:

  • <|noesis|> (id 32000)
  • <|end|> (id 32001)

The tokenizer's configured chat template renders each message as:

{% for message in messages %}<|noesis|>
### Domanda
{{ message['content'] }}

### Risposta
{% endfor %}
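For clarity, the Jinja loop above is equivalent to the following pure-Python rendering. This is a sketch for illustration only; in practice the template is applied by the tokenizer, not by hand:

```python
def render_noesis(messages):
    """Render a message list the way the chat template above does:
    each message becomes a <|noesis|> block with Domanda/Risposta sections.
    `messages` is a list of dicts with a 'content' key, as in the template.
    """
    parts = []
    for message in messages:
        parts.append(
            "<|noesis|>\n"
            "### Domanda\n"
            + message["content"]
            + "\n\n### Risposta\n"
        )
    return "".join(parts)
```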

A stable manual prompt pattern is:

<|noesis|>
### Domanda
Spiega cos'è una funzione in Python.

### Risposta

Quickstart (Transformers)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "RthItalia/PINDARO-HF"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
)

prompt = (
    "<|noesis|>\n"
    "### Domanda\n"
    "Spiega cos'è una funzione in Python.\n"
    "\n"
    "### Risposta\n"
)

inputs = tokenizer(prompt, return_tensors="pt")

# pad_token_id == eos_token_id for this model: set it explicitly to silence
# the generation warning. The tokenizer call above already returns the
# attention_mask inside `inputs`, so it must not be passed a second time.
outputs = model.generate(
    **inputs,
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=120,
    do_sample=False,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=False))
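Because the Quickstart decodes with skip_special_tokens=False, the raw output still contains the prompt and the <|end|> control token. A small helper can isolate just the answer; a sketch that assumes the Noesis format above, with <|end|> as the terminator:

```python
def extract_answer(decoded: str) -> str:
    """Extract the model's answer from a fully decoded generation.

    Assumes the Noesis prompt format: the answer follows the last
    '### Risposta' marker and ends at the first '<|end|>' (if present).
    """
    answer = decoded.split("### Risposta")[-1]
    answer = answer.split("<|end|>")[0]
    return answer.strip()
```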

Validation Snapshot

Last internal validation snapshot: 2026-03-02

  • HF load/config/tokenizer/model smoke tests: PASS
  • Internal mini-eval (5 prompts, general quality gate): 1.00

Notes:

  • This is an internal sanity check, not a public benchmark suite.
  • Separate GGUF quality gating is tracked outside this HF-only repo.

Known Limitations

  • Outputs can become repetitive on some long generations.
  • As with other LLMs, factual and reasoning errors are possible.
  • Use additional validation for high-stakes or production workflows.
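For the repetition issue noted above, the standard Transformers decoding knobs usually help. A hedged sketch; the values are illustrative defaults, not tuned for this model:

```python
# Decoding settings that typically reduce verbatim repetition on long
# generations. Both are standard `model.generate` parameters in Transformers.
anti_repeat_kwargs = dict(
    repetition_penalty=1.15,   # down-weight tokens already emitted
    no_repeat_ngram_size=3,    # forbid exact 3-gram repeats
    max_new_tokens=256,
)

# Usage, with the model and inputs from the Quickstart:
# outputs = model.generate(**inputs, **anti_repeat_kwargs)
```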

Safety

  • Do not use as sole source for legal, medical, or financial decisions.
  • Add moderation, logging, and domain-specific safeguards in downstream apps.

Artifact Checksums (SHA256)

  • model.safetensors: 778e5547c238d0e19738479562cdc310a38f5ee4c5354294a23dfccc92626e87
  • config.json: ae832c409e0d6ad9c8881ec2bd287a8d7e7e9012b712513532cd3ad352ca0655
  • generation_config.json: 6ff47e725c0ec6d0f1895670de7ee68e61a4f99703f6c8e89aea6ab14ea02dc3
  • tokenizer.json: 51433f06369ac3e597dfa23a811215e3511b8f86588a830ded72344b76a193ee
  • tokenizer.model: 9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
  • tokenizer_config.json: 02ca6d3ddfa1112eec7bd5f22a0e682338b5b2da8ddb6761e9d25e6d7b8188d0
  • special_tokens_map.json: d7805e093432afcde852968cdeba3de08a6fe66e77609f4701decb87fc492f33
  • added_tokens.json: ece349d292e246eac9a9072c1730f023e61567984a828fb0d25dccb14e3b7592
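To verify a downloaded artifact against the digests above, a streaming SHA-256 helper avoids loading the multi-gigabyte safetensors file into memory at once:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Compare the returned hex digest with the value listed for the corresponding file; any mismatch indicates a corrupted or tampered download.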