---
base_model: google/gemma-3-12b-it
library_name: peft
license: openrail
pipeline_tag: text-generation
tags:
- base_model:adapter:google/gemma-3-12b-it
- lora
- sft
- transformers
- trl
---

# Gemma3-MIAITS-Adapter

**EN** | [LT](#lt-lietuvių)

---

## EN: English

### Overview

**Gemma3-MIAITS-Adapter** is a LoRA adapter fine-tuned on top of [`google/gemma-3-12b-it`](https://huggingface.co/google/gemma-3-12b-it) for Lithuanian-language misinformation classification, developed as part of the **MIAITS** project (_Melagingos informacijos automatinio identifikavimo tekstyno sukūrimas_ - Lithuanian Misinformation Automatic Identification Text Corpus).

The model classifies Lithuanian news articles and statements into three categories:

| Label           | Meaning                           |
| --------------- | --------------------------------- |
| `Klaidinga`     | False / Fake information          |
| `Manipuliatyvu` | Manipulative / Misleading content |
| `Teisinga`      | True / Correct information        |

---

### Architecture

- **Base model**: [`google/gemma-3-12b-it`](https://huggingface.co/google/gemma-3-12b-it) - Gemma 3 12B instruction-tuned
- **Adapter type**: LoRA (PEFT) via QLoRA (4-bit NF4)
- **Task**: Causal language modelling (text generation) - classification via generated JSON response
- **LoRA rank (r)**: 32
- **LoRA alpha**: 64
- **LoRA dropout**: 0.1
- **Target modules**: `q_proj`, `v_proj` (attention only)
- **Bias**: none
- **Quantization**: 4-bit NF4, compute dtype bfloat16, double quantization enabled
- **PEFT version**: 0.18.1

---

### Training Data

**Source**: Lithuanian misinformation classification dataset.

**Labels** (3-class):

- `Klaidinga` - False
- `Manipuliatyvu` - Manipulative
- `Teisinga` - True

**Text columns**: Each original row was expanded into 3 rows using `7-Statement`, `8-Statement_Context`, and `9-Full_text`. Validation and test sets use `9-Full_text` only.

**Splits**: Stratified 80/10/10.

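The split-then-expand scheme described above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing code: the `label` column name, the helper functions, the seed, and the toy rows are all assumptions. Only the three text column names and the 80/10/10 stratified split come from this card.

```python
import random
from collections import defaultdict

# Text column names as given in this card.
TEXT_COLUMNS = ["7-Statement", "8-Statement_Context", "9-Full_text"]
LABELS = ["Klaidinga", "Manipuliatyvu", "Teisinga"]


def stratified_split(rows, label_key="label", val_frac=0.1, test_frac=0.1, seed=0):
    """Split rows into train/val/test, sampling each label group separately.

    `label_key` is a hypothetical column name; the card does not name it.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, val, test = [], [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_val = round(len(group) * val_frac)
        n_test = round(len(group) * test_frac)
        val.extend(group[:n_val])
        test.extend(group[n_val:n_val + n_test])
        train.extend(group[n_val + n_test:])
    return train, val, test


def expand_train(rows, label_key="label"):
    """Emit one training example per text column, so each row yields 3 examples."""
    return [
        {"text": row[col], label_key: row[label_key]}
        for row in rows
        for col in TEXT_COLUMNS
    ]


def full_text_only(rows, label_key="label"):
    """Validation and test examples use the full article text only."""
    return [{"text": row["9-Full_text"], label_key: row[label_key]} for row in rows]


# Toy data: 30 rows, 10 per label (placeholder strings, not real corpus text).
toy_rows = [
    {
        "7-Statement": f"statement {i}",
        "8-Statement_Context": f"context {i}",
        "9-Full_text": f"full text {i}",
        "label": LABELS[i % 3],
    }
    for i in range(30)
]

train_rows, val_rows, test_rows = stratified_split(toy_rows)
train_examples = expand_train(train_rows)
val_examples = full_text_only(val_rows)
test_examples = full_text_only(test_rows)

print(len(train_examples), len(val_examples), len(test_examples))  # 72 3 3
```

Splitting before expanding keeps all three text variants of an article on the same side of the split, so no validation or test article also appears in training as a shorter statement. The row counts reported for this adapter are consistent with that order: 11,976 training rows is 3 × 3,992, and 3,992 / 499 / 499 is an 80/10/10 division of 4,990 rows, although the card does not state the original row count explicitly.
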
| Split | Rows   |
| ----- | ------ |
| Train | 11,976 |
| Val   | 499    |
| Test  | 499    |

---

### Training Hyperparameters

| Parameter             | Value                         |
| --------------------- | ----------------------------- |
| Learning rate         | 2e-5                          |
| Scheduler             | Cosine (10% warmup)           |
| Weight decay          | 0.05                          |
| Epochs                | 3 (early stopping patience 2) |
| Batch size            | 1                             |
| Gradient accumulation | 32 (effective batch = 32)     |
| Max sequence length   | 4,096                         |
| Precision             | BF16                          |
| Max new tokens (eval) | 512                           |

**Selected checkpoint**: epoch 1 (best eval_loss). Runtime: ~11.6h.

---

### Prompt Format

The system prompt instructs the model (in Lithuanian) to classify the text and respond in JSON:

```json
{"label": "