---
language: vi
tags:
- vietnamese
- causal-lm
- finetuning
- viena
library_name: transformers
pipeline_tag: text-generation
license: other
base_model: vietrix/viena-60m-pretrain
---

# Viena 60M (SFT)

## Model details

- Developed by: Vietrix
- Model type: decoder-only causal LM (Llama-style)
- Parameters: ~60M
- Layers: 16
- Hidden size: 512
- Attention heads: 8 (KV heads: 4)
- Max sequence length: 1024
- RoPE theta: 10000
- Normalization/MLP: RMSNorm + SwiGLU
- Precision: BF16 training

## Tokenizer

- SentencePiece BPE
- Target vocab in config: 32k
- Actual vocab in `tokenizer.model`: 2,105 (trained on a small corpus)
- Note: embeddings are sized for 32k; only the first 2,105 tokens are used by the tokenizer.

## Training data

- Internal synthetic Vietnamese instruction/chat data.
- Train/val split: 2,000 / 200 JSONL records.
- Format: messages with roles (system/user/assistant/tool).
- PII: best-effort redaction applied during dataset preparation.

## Fine-tuning procedure

- Initialized from: `vietrix/viena-60m-pretrain`.
- Objective: token-level cross-entropy; prompt loss disabled (prompt tokens are masked out of the loss).
- Sequence length: 1024.
- Global batch size: 32 (per-device batch 8 × gradient accumulation 4).
- Optimizer: AdamW, lr 2e-4, weight decay 0.01, cosine decay with warmup.
- Steps: 1,000.
- Validation every 200 steps (10 batches).

## Intended use

- Vietnamese chat/instruction-following use cases.
- Research and prototyping; not a production-grade safety model.

## Limitations

- Trained on a small synthetic corpus; may hallucinate or respond incorrectly.
- Not safety-tuned for sensitive domains.
- Tokenizer vocab is small, so lexical coverage is limited.
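For illustration, a single JSONL training record in the messages format described above might look like the following. The exact field names are an assumption; only the roles (system/user/assistant/tool) are documented in this card.

```python
import json

# Hypothetical example of one JSONL record in the messages format.
# Field names ("messages", "role", "content") are assumptions,
# not the confirmed schema of the internal dataset.
record = {
    "messages": [
        {"role": "system", "content": "Bạn là trợ lý AI hữu ích."},
        {"role": "user", "content": "Thủ đô của Việt Nam là gì?"},
        {"role": "assistant", "content": "Thủ đô của Việt Nam là Hà Nội."},
    ]
}

# Each record is serialized as a single line of JSONL.
line = json.dumps(record, ensure_ascii=False)
parsed = json.loads(line)
```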
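"Prompt loss disabled" is commonly implemented by masking prompt positions out of the label sequence so cross-entropy is computed only on response tokens. A minimal sketch with plain Python lists (the `-100` ignore index follows the PyTorch/transformers convention; this is an illustration, not the actual training code):

```python
IGNORE_INDEX = -100  # ignored by PyTorch's cross-entropy loss


def mask_prompt_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len
    positions so loss is computed only on response tokens."""
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels


# Example: 4 prompt tokens followed by 3 response tokens.
labels = mask_prompt_labels([5, 8, 2, 9, 31, 7, 11], prompt_len=4)
# labels == [-100, -100, -100, -100, 31, 7, 11]
```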
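The learning-rate schedule above (linear warmup, then cosine decay from the 2e-4 peak over 1,000 steps) can be sketched as follows. The warmup length is a placeholder, since the card does not state it:

```python
import math

PEAK_LR = 2e-4
TOTAL_STEPS = 1000
WARMUP_STEPS = 100  # assumption: the actual warmup length is not stated


def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```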
## How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vietrix/viena-60m"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_id)
```

If `AutoTokenizer` fails, load the SentencePiece model explicitly (`LlamaTokenizer` is already the slow tokenizer, so `use_fast` is not needed):

```python
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained(model_id)
```