---
base_model:
- unsloth/Llama-3.2-3B-Instruct-bnb-4bit
pipeline_tag: text-generation
---
### Model Summary

This model is a **Korean instruction-following Small Language Model (SLM)** fine-tuned from the **Llama-3.2-3B base model** using **Supervised Fine-Tuning (SFT)**.
The objective of this model is to validate a **resource-efficient fine-tuning and deployment pipeline** suitable for **on-premise and constrained GPU/CPU environments**, rather than to maximize benchmark scores.

---
### Training Approach

* **Base Model**: Meta Llama-3.2-3B (base, non-instruct)
* **Fine-Tuning Method**: Supervised Fine-Tuning (SFT)
* **Parameter-Efficient Training**: LoRA (PEFT)
* **Quantization During Training**: 4-bit (QLoRA)
* **Training Framework**: Unsloth + Hugging Face TRL
* **Training Environment**: Single GPU (Google Colab, Tesla T4)

The model was trained using an **instruction–response prompt template (Alpaca-style)**, enabling stable instruction-following behavior in Korean.
The fine-tuning process focused on **maintaining the base model’s general language capability while adapting response style, tone, and instruction compliance**.

---
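The Alpaca-style instruction–response template described above can be sketched as follows. The exact wording, field names, and EOS token below are illustrative assumptions, not the verbatim template used in training:

```python
# Minimal sketch of an Alpaca-style instruction-response template.
# The template text and EOS token are assumptions for illustration.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

EOS_TOKEN = "<|end_of_text|>"  # assumed Llama-3.2 base EOS marker

def format_example(instruction: str, response: str) -> str:
    """Render one training sample, appending EOS so generation
    terminates instead of running on indefinitely."""
    return ALPACA_TEMPLATE.format(instruction=instruction, response=response) + EOS_TOKEN

sample = format_example("대한민국의 수도는 어디인가요?", "대한민국의 수도는 서울입니다.")
```

Appending the EOS token to every rendered sample is what gives the model a reliable stopping signal at inference time.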
### Dataset

* **Primary Dataset**: `korean_safe_conversation`
* **Language**: Korean
* **Data Type**: Instruction–response conversational data
* **Data Scale**: ~27K samples

The dataset was preprocessed to ensure:

* Clear separation between instruction and response
* Explicit end-of-sequence (EOS) control to prevent uncontrolled generation
* Consistent prompt formatting for stable training behavior

---
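The three preprocessing goals above can be expressed as a simple validation pass. The field names (`instruction`, `response`, `text`) and EOS token are assumptions for illustration, not the dataset's actual schema:

```python
# Hedged sketch: a validation pass mirroring the three preprocessing goals.
# Field names and the EOS marker are assumptions, not the actual schema.
EOS_TOKEN = "<|end_of_text|>"  # assumed EOS marker

def is_valid_sample(sample: dict) -> bool:
    """Return True if the sample cleanly separates instruction and
    response and its rendered text ends with an explicit EOS token."""
    instruction = sample.get("instruction", "").strip()
    response = sample.get("response", "").strip()
    if not instruction or not response:
        return False  # no clear instruction/response separation
    return sample.get("text", "").endswith(EOS_TOKEN)  # explicit EOS control

samples = [
    {"instruction": "안녕하세요?", "response": "안녕하세요!",
     "text": "### Instruction:\n안녕하세요?\n\n### Response:\n안녕하세요!" + EOS_TOKEN},
    {"instruction": "", "response": "답변", "text": "답변"},  # dropped: empty instruction
]
valid = [s for s in samples if is_valid_sample(s)]
```

Filtering out samples that fail any of the three checks keeps training behavior stable.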
|
| 45 |
+
|
| 46 |
+
### Intended Use
|
| 47 |
+
|
| 48 |
+
This model is intended for:
|
| 49 |
+
|
| 50 |
+
* Korean instruction-following assistants
|
| 51 |
+
* Domain-adapted SLM experimentation
|
| 52 |
+
* On-premise inference scenarios where:
|
| 53 |
+
|
| 54 |
+
* Data privacy is critical
|
| 55 |
+
* GPU resources are limited
|
| 56 |
+
* Low-latency local inference is preferred
|
| 57 |
+
|
| 58 |
+
Typical application examples include:
|
| 59 |
+
|
| 60 |
+
* Internal enterprise assistants
|
| 61 |
+
* Document-based Q&A systems (pre/post-RAG)
|
| 62 |
+
* Operational report generation from structured or semi-structured text
|
| 63 |
+
|
| 64 |
+
---
### Deployment

* **Format**: GGUF
* **Quantization**: Q8
* **Deployment Target**: CPU or low-VRAM environments
* **Distribution**: Hugging Face Hub

The GGUF format allows the model to be deployed **without external API dependencies**, making it suitable for **secure, offline, or air-gapped environments**.

---
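A rough back-of-envelope estimate shows why a Q8 GGUF of a 3B-parameter model fits CPU and low-VRAM targets. The parameter count and effective bits-per-weight figures below are approximate assumptions, not measured sizes:

```python
# Back-of-envelope weight-size estimate for quantized GGUF artifacts.
# Parameter count (~3.2B) and effective bits/weight (including scale
# metadata for quantized formats) are rough assumptions.
PARAMS = 3.2e9
BITS_PER_WEIGHT = {"f16": 16, "q8": 8.5, "q4": 4.5}

def est_gib(quant: str) -> float:
    """Approximate model weight size in GiB for a given quantization."""
    return PARAMS * BITS_PER_WEIGHT[quant] / 8 / 2**30
```

Under these assumptions, Q8 lands near ~3 GiB versus ~6 GiB at f16, which is the margin that makes CPU-only or low-VRAM inference practical.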
### Limitations

* This model is **not an official Meta Instruct model**
* Preference optimization methods such as DPO or RLHF were not applied
* The model was trained for **behavior adaptation and stability**, not for benchmark optimization
* Performance may vary outside the instruction-following and conversational domains

---
### Technical Motivation

This project demonstrates that **domain-adapted instruction-following models can be efficiently built and deployed using small-scale resources**, providing a practical alternative to large, cost-intensive LLM deployments in real-world systems.