Vircava-3B-FP32

Vircava-3B-FP32 is a Latvian-language fine-tune of ibm-granite/granite-4.1-3b, trained on TitleOS/latvian_glaiveai_reasoning-v1_5k_subset — a Latvian-translated subset of the GlaiveAI reasoning-v1 dataset. It's designed to bring chain-of-thought reasoning and conversational fluency in Latvian to hardware that most people actually own: CPUs, integrated GPUs, and low-end discrete cards. If you can run a 3B model at all, you can run this one.

Vircava is the first model in a planned family targeting Latvian as a first-class language for both general reasoning and creative writing.


What it can do

  • Converse naturally in Latvian, including multi-turn dialogue
  • Produce structured chain-of-thought reasoning in Latvian before arriving at an answer
  • Use Granite's native tool-calling format, inherited from the base model and preserved through fine-tuning
  • Handle mixed Latvian/English prompts gracefully
  • Run entirely on CPU, making it usable without any GPU at all

Granite 4.1's tool-calling capabilities are part of the base model's instruction format and carry forward here. If you're building an agentic pipeline and want it to operate in Latvian, this is a reasonable starting point.
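As a rough sketch of how a tool might be passed through the chat template: recent versions of transformers accept a `tools` argument on `apply_chat_template`, taking JSON-schema style function descriptions. The `get_weather` tool, its parameters, and the `build_tool_prompt` helper below are hypothetical, and the exact prompt text depends on the chat template shipped with the model.

```python
# Hypothetical tool description in the JSON-schema style that
# transformers chat templates accept via the `tools` argument.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative name, not part of the model
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Rīga"},
            },
            "required": ["city"],
        },
    },
}

# "What is the weather like in Riga right now?"
messages = [{"role": "user", "content": "Kādi šobrīd ir laikapstākļi Rīgā?"}]


def build_tool_prompt(tokenizer, messages, tools):
    """Render the prompt text with the tool list included.

    `tokenizer` is assumed to be the AutoTokenizer loaded in the Quickstart.
    """
    return tokenizer.apply_chat_template(
        messages,
        tools=tools,
        add_generation_prompt=True,
        tokenize=False,  # return a string so the rendered prompt can be inspected
    )
```

Calling `build_tool_prompt(tokenizer, messages, [WEATHER_TOOL])` and printing the result is a quick way to check how the tool list is rendered before wiring the model into a pipeline.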


Intended hardware

This model is specifically sized and trained for accessibility. Target environments include:

  • CPU inference via llama.cpp or Ollama (recommended for most users)
  • Low-end consumer GPUs (4–8GB VRAM) with appropriate quantization (Q4_K_M or Q5_K_M recommended)
  • Integrated graphics with shared memory setups

For CPU and low-VRAM deployments, use a quantized GGUF version. The FP32 weights in this repository are the canonical release, intended as the source from which quantized artifacts are derived.
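To make the sizing concrete, weight memory can be estimated as parameter count times bits per weight. The sketch below assumes a round 3 billion parameters and treats the bits-per-weight figures for Q4_K_M and Q5_K_M as rough approximations rather than exact values. It also ignores the KV cache and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 3e9  # assumption: a round 3B; the exact count may differ

# Bits per weight. FP32 is exact; the quantized figures are approximate
# averages assumed here for illustration only.
FORMATS = {"FP32": 32.0, "Q5_K_M": 5.7, "Q4_K_M": 4.85}

for name, bits in FORMATS.items():
    print(f"{name:>7}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the FP32 weights alone come to about 12 GB, which is why a quantized GGUF is the practical choice for 4–8GB cards and most CPU setups.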


Quickstart

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "TitleOS/Vircava-3B-FP32"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,
    device_map="cpu",  # or "auto" if you have a GPU
)

messages = [
    {
        "role": "user",
        # "Explain why the sky is blue. Think step by step."
        "content": "Izskaidro, kāpēc debesis ir zilas. Domā soli pa solim."
    }
]

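# Build the prompt with the model's chat template; return_dict=True also
# returns the attention mask that generate() expects.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)  # keep inputs on the same device as the model

output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True))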

Training details

| Parameter | Value |
| --- | --- |
| Base model | ibm-granite/granite-4.1-3b |
| Training dataset | TitleOS/latvian_glaiveai_reasoning-v1_5k_subset |
| Fine-tuning method | LoRA (rsLoRA) |
| LoRA rank | 32 |
| LoRA alpha | 64 |
| rsLoRA scale | ~11.3 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Epochs | 1 |
| Effective batch size | 16 |
| Learning rate | 2e-4 |
| LR scheduler | Cosine |
| Max sequence length | 2048 |
| Precision | FP32 (full, no quantization during training) |
| Hardware | Tesla P40 (24GB) |
| Loss masking | Completion-only (assistant turns only) |
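
The rsLoRA scale listed above follows from the rank and alpha: standard LoRA scales the adapter update by alpha / r, while rank-stabilized LoRA uses alpha / sqrt(r). A small check of the arithmetic, using only the rank and alpha from the training details (the function names are illustrative):

```python
import math


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor: alpha / r."""
    return alpha / rank


def rslora_scale(alpha: float, rank: int) -> float:
    """Rank-stabilized LoRA scaling factor: alpha / sqrt(r)."""
    return alpha / math.sqrt(rank)


RANK, ALPHA = 32, 64  # values from the training details

print(f"LoRA scale:   {lora_scale(ALPHA, RANK):.2f}")    # 2.00
print(f"rsLoRA scale: {rslora_scale(ALPHA, RANK):.2f}")  # 11.31
```

With PEFT, rank-stabilized scaling is typically enabled by setting `use_rslora=True` on `LoraConfig`; whether the training script for this model used PEFT is not stated here.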

The dataset is a 5k-row Latvian translation of GlaiveAI's reasoning-v1 dataset, produced using Facebook's NLLB-200-3.3B translation model. The training mix also includes natural Latvian text from the RaivisDejus/latvian-text corpus to support general language fluency alongside structured reasoning.
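
Completion-only loss masking, as listed in the training details, means prompt tokens are excluded from the loss so that only assistant tokens contribute. One common way to express this, sketched here with plain Python lists and a hypothetical `mask_prompt_labels` helper, is to copy the input IDs into the labels and overwrite the prompt positions with -100, the index that PyTorch's cross-entropy loss ignores by default. This illustrates the idea for a single-turn example and is not the training code used for this model.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids: list[int], prompt_len: int) -> list[int]:
    """Return labels where the first `prompt_len` tokens are ignored by the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must be between 0 and len(input_ids)")
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


# Toy token IDs: the first three stand in for the user prompt,
# the last two for the assistant's reply.
example_ids = [101, 7, 42, 900, 2]
labels = mask_prompt_labels(example_ids, prompt_len=3)
print(labels)  # [-100, -100, -100, 900, 2]
```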


Limitations

Vircava-3B-FP32 is an early-stage model. A few things to be realistic about:

  • 3B parameters is small. Reasoning depth and instruction-following are more limited than in larger models. Complex multi-step problems may produce reasoning chains that are only partially correct.
  • 5k training rows is a modest dataset. Latvian fluency is functional but not flawless. Unusual phrasings or domain-specific vocabulary may produce less natural output.
  • Tool calling is inherited, not extensively validated. The base model's tool-calling format carries through, but testing has been limited to standard conversational use.
  • This is not a safety-tuned model. It inherits Granite 4.1's base behavior. Do not deploy it in contexts requiring robust content filtering without additional alignment work.
  • English bleed is possible. On prompts that mix Latvian and English, the model may respond partially or fully in English, particularly for topics that appeared rarely in Latvian in the training data.

The Vircava family (planned)

Vircava-3B-FP32 is the first release. Two 27B models are in development:

Riga-27B

A larger version of this model, fine-tuned for Latvian reasoning and conversation at scale. Intended for GPU-equipped deployments at universities, research institutions, and other organizations with proper inference infrastructure. Based on a 27B foundation model, it will offer substantially deeper reasoning chains and more robust Latvian fluency than the 3B variant.

Vircava-Rakstnieks-27B ("Writer", placeholder title)

A Latvian creative writing model fine-tuned on LatSenRom, the Corpus of Latvian Early Novels (1879–1940), available through the Latvian National Corpus Collection at korpuss.lv. The base model is google/gemma-3-27b-it. The goal is a model that writes in the style and register of early Latvian literary prose — a register that no general-purpose model currently handles well, and one with significant cultural and research value.

Both models will be released under the same license as this one when training is complete.


License

Vircava-3B-FP32 is released under the MPL-2.0 license, modified with a Commons Clause condition. You are free to use, study, modify, and redistribute the model for non-commercial purposes, but you may not sell the model, or a product whose primary commercial value is the model itself, without explicit written permission.

See LICENSE.md for the full license text and terms.


Citation

If you use Vircava-3B-FP32 in research or a project, a citation or mention is appreciated:

@misc{vircava3b2025,
  author = {TitleOS},
  title = {Vircava-3B-FP32: A Latvian Reasoning Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/TitleOS/Vircava-3B-FP32}
}
