---
base_model:
- unsloth/Llama-3.2-3B-Instruct-bnb-4bit
pipeline_tag: text-generation
---
### Model Summary

This model is a **Korean instruction-following Small Language Model (SLM)** fine-tuned from the **Llama-3.2-3B base model** using **Supervised Fine-Tuning (SFT)**.
The objective of this model is to validate a **resource-efficient fine-tuning and deployment pipeline** suitable for **on-premise and constrained GPU/CPU environments**, rather than to maximize benchmark scores.

---
### Training Approach

* **Base Model**: Meta Llama-3.2-3B (base, non-instruct)
* **Fine-Tuning Method**: Supervised Fine-Tuning (SFT)
* **Parameter-Efficient Training**: LoRA (PEFT)
* **Quantization During Training**: 4-bit (QLoRA)
* **Training Framework**: Unsloth + Hugging Face TRL
* **Training Environment**: Single GPU (Google Colab, Tesla T4)

The model was trained using an **instruction–response prompt template (Alpaca-style)**, enabling stable instruction-following behavior in Korean.
The fine-tuning process focused on **maintaining the base model’s general language capability while adapting response style, tone, and instruction compliance**.

---
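The Alpaca-style instruction–response template described above can be sketched as follows. The exact wording, field names, and EOS token below are illustrative assumptions, not the verbatim template used in training:

```python
# Minimal sketch of an Alpaca-style instruction-response template.
# The template text and EOS token are assumptions for illustration.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

EOS_TOKEN = "<|end_of_text|>"  # assumed Llama-3.2 base EOS marker

def format_example(instruction: str, response: str) -> str:
    """Render one training sample, appending EOS so generation
    terminates instead of running on indefinitely."""
    return ALPACA_TEMPLATE.format(instruction=instruction, response=response) + EOS_TOKEN

sample = format_example("대한민국의 수도는 어디인가요?", "대한민국의 수도는 서울입니다.")
```

Appending the EOS token to every rendered sample is what gives the model a reliable stopping signal at inference time.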
### Dataset

* **Primary Dataset**: `korean_safe_conversation`
* **Language**: Korean
* **Data Type**: Instruction–response conversational data
* **Data Scale**: ~27K samples

The dataset was preprocessed to ensure:

* Clear separation between instruction and response
* Explicit end-of-sequence (EOS) control to prevent uncontrolled generation
* Consistent prompt formatting for stable training behavior

---
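The three preprocessing goals above can be expressed as a simple validation pass. The field names (`instruction`, `response`, `text`) and EOS token are assumptions for illustration, not the dataset's actual schema:

```python
# Hedged sketch: a validation pass mirroring the three preprocessing goals.
# Field names and the EOS marker are assumptions, not the actual schema.
EOS_TOKEN = "<|end_of_text|>"  # assumed EOS marker

def is_valid_sample(sample: dict) -> bool:
    """Return True if the sample cleanly separates instruction and
    response and its rendered text ends with an explicit EOS token."""
    instruction = sample.get("instruction", "").strip()
    response = sample.get("response", "").strip()
    if not instruction or not response:
        return False  # no clear instruction/response separation
    return sample.get("text", "").endswith(EOS_TOKEN)  # explicit EOS control

samples = [
    {"instruction": "안녕하세요?", "response": "안녕하세요!",
     "text": "### Instruction:\n안녕하세요?\n\n### Response:\n안녕하세요!" + EOS_TOKEN},
    {"instruction": "", "response": "답변", "text": "답변"},  # dropped: empty instruction
]
valid = [s for s in samples if is_valid_sample(s)]
```

Filtering out samples that fail any of the three checks keeps training behavior stable.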
|
| 45 |
+
|
| 46 |
+
### Intended Use
|
| 47 |
+
|
| 48 |
+
This model is intended for:
|
| 49 |
+
|
| 50 |
+
* Korean instruction-following assistants
|
| 51 |
+
* Domain-adapted SLM experimentation
|
| 52 |
+
* On-premise inference scenarios where:
|
| 53 |
+
|
| 54 |
+
* Data privacy is critical
|
| 55 |
+
* GPU resources are limited
|
| 56 |
+
* Low-latency local inference is preferred
|
| 57 |
+
|
| 58 |
+
Typical application examples include:
|
| 59 |
+
|
| 60 |
+
* Internal enterprise assistants
|
| 61 |
+
* Document-based Q&A systems (pre/post-RAG)
|
| 62 |
+
* Operational report generation from structured or semi-structured text
|
| 63 |
+
|
| 64 |
+
---
### Deployment

* **Format**: GGUF
* **Quantization**: Q8
* **Deployment Target**: CPU or low-VRAM environments
* **Distribution**: Hugging Face Hub

The GGUF format allows the model to be deployed **without external API dependencies**, making it suitable for **secure, offline, or air-gapped environments**.

---
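A rough back-of-envelope estimate shows why a Q8 GGUF of a 3B-parameter model fits CPU and low-VRAM targets. The parameter count and effective bits-per-weight figures below are approximate assumptions, not measured sizes:

```python
# Back-of-envelope weight-size estimate for quantized GGUF artifacts.
# Parameter count (~3.2B) and effective bits/weight (including scale
# metadata for quantized formats) are rough assumptions.
PARAMS = 3.2e9
BITS_PER_WEIGHT = {"f16": 16, "q8": 8.5, "q4": 4.5}

def est_gib(quant: str) -> float:
    """Approximate model weight size in GiB for a given quantization."""
    return PARAMS * BITS_PER_WEIGHT[quant] / 8 / 2**30
```

Under these assumptions, Q8 lands near ~3 GiB versus ~6 GiB at f16, which is the margin that makes CPU-only or low-VRAM inference practical.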
### Limitations

* This model is **not an official Meta Instruct model**
* Preference optimization methods such as DPO or RLHF were not applied
* The model was trained for **behavior adaptation and stability**, not for benchmark optimization
* Performance may vary outside the instruction-following and conversational domains

---
### Technical Motivation

This project demonstrates that **domain-adapted instruction-following models can be efficiently built and deployed using small-scale resources**, providing a practical alternative to large, cost-intensive LLM deployments in real-world systems.