Qwen3-0.6B SR Instruct

🇷🇸 Srpski

Opis Modela

Ovaj model je specijalizovana verzija Qwen3-0.6B, adaptirana (fine-tuned) za srpski jezik. Model je prošao kroz dve faze treninga:

  1. Osnovni model: Trening na visokokvalitetnim naučnim tekstovima.
  2. Instruct model: Podešavanje za praćenje uputstava (ChatML format) koristeći specifične setove podataka.

Karakteristike

  • Veličina: 0.6 milijardi parametara (izuzetno brz na manjim karticama).
  • Format: ChatML (<|im_start|>, <|im_end|>).

Preporučeni parametri za Inference

Za najbolju gramatiku i logiku, preporučuje se Beam Search:

# num_beams=5, do_sample=False, no_repeat_ngram_size=3

🇬🇧 English

Model Description

Qwen3-0.6B-SR-Instruct is a specialized, lightweight language model fine-tuned for the Serbian language.

The model underwent a two-stage training process:

  1. Base Model: Training on academic and scientific corpora.
  2. Instruction Tuning: Refined using specialized instruction sets in ChatML format to ensure professional and context-aware responses.

Key Features

  • Compact & Efficient: At 0.6B parameters, it offers high-speed inference even on consumer-grade GPUs.
  • Formatting: Uses ChatML template for clean conversational flow.
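Since the model expects raw ChatML, a single-turn prompt can be assembled by hand; a minimal sketch (the helper name `chatml_prompt` is illustrative, not part of the model's API):

```python
def chatml_prompt(user_msg, system_msg=None):
    """Assemble a single-turn ChatML prompt ending with an open assistant tag."""
    parts = []
    if system_msg:
        # Optional system turn to steer the model's behavior.
        parts.append(f"<|im_start|>system\n{system_msg}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>\n")
    # Leave the assistant tag open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = chatml_prompt("Objasni važnost digitalizacije arhiva.")
```

The Quick Start section below builds the same prompt inline as a single string.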

Usage & Inference Settings

To achieve optimal grammatical correctness in Serbian, we recommend using Beam Search over random sampling:

  • Beam Count: 5 (num_beams=5)
  • Sampling: Disabled (do_sample=False)
  • Repetition Penalty: 1.1 (repetition_penalty=1.1)
  • N-gram Blocking: no_repeat_ngram_size=3
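Collected as keyword arguments for a `transformers` text-generation pipeline (a sketch; the repetition penalty and n-gram blocking can be combined or used separately):

```python
# Recommended decoding settings for grammatically correct Serbian output.
GENERATION_KWARGS = {
    "num_beams": 5,             # beam search instead of random sampling
    "do_sample": False,         # deterministic decoding
    "repetition_penalty": 1.1,  # mild penalty against loops
    "no_repeat_ngram_size": 3,  # block verbatim 3-gram repeats
    "max_new_tokens": 300,
}

# Usage, assuming a text-generation pipeline `pipe` as in the Quick Start:
# output = pipe(prompt, **GENERATION_KWARGS)
```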

📈 Training Progress

The model was trained using a structured Instruct Tuning approach. The chart below visualizes the loss reduction over 1500 steps, showing a smooth convergence to a final loss of 1.2655.

  • Training Loss: Consistent decline, indicating effective learning of the Serbian instruction set.

  • Final Loss: 1.2655 after 1500 steps, consistent with stable convergence on the instruction set.

  • Hardware: Optimized for a single GPU with 24 GB of VRAM.

Training Graph

⚠️ Limitations (Ograničenja)

SR: S obzirom na veličinu od 0.6B parametara, model može imati sledeća ograničenja:

  • Halucinacije: Pri visokim temperaturama (sampling), model može generisati fiktivne podatke. Uvek koristite Beam Search za kritične informacije.

  • Sugestivnost: Model teži da se složi sa korisnikom čak i ako je tvrdnja netačna (kao što smo videli u testu sa "spaljivanjem"). Koristite stroge System Prompte.

EN: Given the 0.6B parameter scale, the following limitations apply:

  • Hallucinations: High temperature settings may lead to factual errors. Use Beam Search for accuracy.

  • Compliance Bias: The model might follow incorrect user premises. Use strong System Instructions to anchor the model's logic.
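One way to anchor the model is to prepend a system turn in the ChatML template; a minimal sketch (the system and user texts are illustrative examples, not from the training set):

```python
# System instruction telling the model to correct false premises
# instead of agreeing with them.
system = (
    "Ti si precizan asistent. Ako je tvrdnja korisnika netačna, "
    "ispravi je umesto da se složiš."
)
user = "Da li je Nikola Tesla rođen u Beogradu?"

# ChatML prompt with a system turn, a user turn, and an open assistant tag.
prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{user}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)
```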

🛠️ Quick Start / Kako koristiti

from transformers import pipeline
import torch

# Repository id on the Hugging Face Hub.
model_id = "Sagicc/Qwen3-0.6B-sr-Instruct"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # BF16 weights
    device_map="auto",
    trust_remote_code=True,
)

# ChatML prompt: a user turn followed by an open assistant tag.
prompt = "<|im_start|>user\nObjasni važnost digitalizacije arhiva.<|im_end|>\n<|im_start|>assistant\n"

# Beam search with sampling disabled, as recommended above.
output = pipe(prompt, max_new_tokens=300, num_beams=5, do_sample=False, no_repeat_ngram_size=3)
print(output[0]["generated_text"])

Dataset copyright Nikola Janković, 2025; licensed under the Creative Commons Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0) license.
