Text Generation
GGUF
llama
pelosi70 committed on
Commit 9ffd08a · verified · 1 Parent(s): b9634ea

Update model card

Files changed (1):
  1. README.md +80 -1
README.md CHANGED
@@ -6,4 +6,83 @@ base_model:
 - unsloth/Llama-3.2-3B-Instruct-bnb-4bit
 pipeline_tag: text-generation
 ---
-Korean instruction-following SLM fine-tuned from Llama-3.2-3B base using SFT and released in GGUF format for on-premise inference.
### Model Summary

This model is a **Korean instruction-following Small Language Model (SLM)** fine-tuned from the **Llama-3.2-3B base model** using **Supervised Fine-Tuning (SFT)**.
The objective of this model is to validate a **resource-efficient fine-tuning and deployment pipeline** suitable for **on-premise and constrained GPU/CPU environments**, rather than to maximize benchmark scores.

---

### Training Approach

* **Base Model**: Meta Llama-3.2-3B (base, non-instruct)
* **Fine-Tuning Method**: Supervised Fine-Tuning (SFT)
* **Parameter-Efficient Training**: LoRA (PEFT)
* **Quantization During Training**: 4-bit (QLoRA)
* **Training Framework**: Unsloth + Hugging Face TRL
* **Training Environment**: Single GPU (Google Colab, Tesla T4)

The model was trained using an **instruction–response prompt template (Alpaca-style)**, enabling stable instruction-following behavior in Korean.
The fine-tuning process focused on **maintaining the base model’s general language capability while adapting response style, tone, and instruction compliance**.
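The exact template is not published in the card; a minimal sketch of an Alpaca-style layout, assuming the standard instruction/response headers and the Llama-3 family end-of-text token (verify both against the actual tokenizer and training script):

```python
# Assumed Alpaca-style template; the model's exact template is not published.
EOS_TOKEN = "<|end_of_text|>"  # Llama-3.2 base EOS; verify against the tokenizer

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def format_example(instruction: str, response: str) -> str:
    """Render one training example and terminate it with EOS so the
    model learns to stop instead of generating indefinitely."""
    return ALPACA_TEMPLATE.format(instruction=instruction, response=response) + EOS_TOKEN

sample = format_example(
    "서울에 대해 한 문장으로 소개해 주세요.",
    "서울은 대한민국의 수도이자 최대 도시입니다.",
)
```

Appending the EOS token to every rendered example is what gives the "explicit end-of-sequence control" described below in the dataset section.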

---

### Dataset

* **Primary Dataset**: `korean_safe_conversation`
* **Language**: Korean
* **Data Type**: Instruction–response conversational data
* **Data Scale**: ~27K samples

The dataset was preprocessed to ensure:

* Clear separation between instruction and response
* Explicit end-of-sequence (EOS) control to prevent uncontrolled generation
* Consistent prompt formatting for stable training behavior
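A hypothetical validation pass mirroring the three guarantees above; the field names (`instruction`, `response`) and header strings are illustrative assumptions, not the card's published schema:

```python
# Illustrative filter over raw instruction-response pairs (field names assumed).
EOS_TOKEN = "<|end_of_text|>"

def is_valid_example(example: dict) -> bool:
    instruction = example.get("instruction", "").strip()
    response = example.get("response", "").strip()
    # 1. Instruction and response must be present as clearly separated fields.
    if not instruction or not response:
        return False
    # Render with the assumed Alpaca-style headers and an explicit EOS marker.
    text = f"### Instruction:\n{instruction}\n\n### Response:\n{response}{EOS_TOKEN}"
    # 2. Every example must end with EOS; 3. formatting must be consistent,
    # i.e. exactly one instruction header and one response header per example.
    return (
        text.endswith(EOS_TOKEN)
        and text.count("### Instruction:") == 1
        and text.count("### Response:") == 1
    )

good = {"instruction": "자기소개를 해 주세요.", "response": "안녕하세요, 한국어 어시스턴트입니다."}
bad = {"instruction": "", "response": "응답만 있는 예시"}
```

Examples failing any of the three checks would be dropped before training under this sketch.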
43
+
44
+ ---
45
+
46
+ ### Intended Use
47
+
48
+ This model is intended for:
49
+
50
+ * Korean instruction-following assistants
51
+ * Domain-adapted SLM experimentation
52
+ * On-premise inference scenarios where:
53
+
54
+ * Data privacy is critical
55
+ * GPU resources are limited
56
+ * Low-latency local inference is preferred
57
+
58
+ Typical application examples include:
59
+
60
+ * Internal enterprise assistants
61
+ * Document-based Q&A systems (pre/post-RAG)
62
+ * Operational report generation from structured or semi-structured text
63
+
64
+ ---
65
+
66
+ ### Deployment
67
+
68
+ * **Format**: GGUF
69
+ * **Quantization**: Q8
70
+ * **Deployment Target**: CPU or low-VRAM environments
71
+ * **Distribution**: Hugging Face Hub
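A back-of-the-envelope footprint estimate for the low-VRAM claim, assuming GGUF Q8_0 (blocks of 32 int8 weights plus one fp16 scale, about 8.5 bits per weight) and the nominal ~3.21B parameter count of Llama-3.2-3B; actual file size and runtime memory will differ:

```python
# Rough weight-memory estimate for a Q8_0-quantized 3B model (not measured).
# GGUF Q8_0 stores 32 weights as 32 int8 values plus one fp16 scale:
# 34 bytes per 32 weights, i.e. ~8.5 bits per weight including scales.
PARAMS = 3.21e9            # approximate parameter count of Llama-3.2-3B
BITS_PER_WEIGHT = 8.5      # Q8_0 average, including per-block scales

weight_bytes = PARAMS * BITS_PER_WEIGHT / 8
weight_gib = weight_bytes / 2**30   # roughly 3.2 GiB of weights
```

KV cache and runtime buffers add to this, but the weights alone fit comfortably in ordinary desktop RAM, which is what makes CPU-only deployment practical.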

The GGUF format allows the model to be deployed **without external API dependencies**, making it suitable for **secure, offline, or air-gapped environments**.
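An illustrative fully local invocation with llama.cpp's `llama-cli`; the model filename is a placeholder and the sampling/context settings are examples to tune for the host, not values published with this model:

```shell
# Offline CPU inference with llama.cpp (model filename is a placeholder).
# -ngl 0 keeps all layers on CPU; tune -c (context) and -t (threads) to the host.
./llama-cli -m ./model-q8_0.gguf -ngl 0 -c 4096 -t 8 --temp 0.7 \
  -p "지시문에 따라 답변하세요: 서울에 대해 소개해 주세요."
```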

---

### Limitations

* This model is **not an official Meta Instruct model**
* Preference optimization methods such as DPO or RLHF were not applied
* The model was trained for **behavior adaptation and stability**, not for benchmark optimization
* Performance may vary outside the instruction-following and conversational domains

---

### Technical Motivation

This project demonstrates that **domain-adapted instruction-following models can be efficiently built and deployed using small-scale resources**, providing a practical alternative to large, cost-intensive LLM deployments in real-world systems.