
Model Summary

This model is a Korean instruction-following Small Language Model (SLM) fine-tuned from the Llama-3.2-3B base model using Supervised Fine-Tuning (SFT). The objective of this work is to validate a resource-efficient fine-tuning and deployment pipeline suitable for on-premise and constrained GPU/CPU environments, rather than to maximize benchmark scores.


Training Approach

  • Base Model: Meta Llama-3.2-3B (base, non-instruct)
  • Fine-Tuning Method: Supervised Fine-Tuning (SFT)
  • Parameter-Efficient Training: LoRA (PEFT)
  • Quantization During Training: 4-bit (QLoRA)
  • Training Framework: Unsloth + Hugging Face TRL
  • Training Environment: Single GPU (Google Colab, Tesla T4)
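The setup above can be sketched as follows. This is an illustrative outline, not the exact configuration used: the LoRA rank, alpha, sequence length, and target modules are assumptions (the values shown are common defaults for Llama-family QLoRA runs with Unsloth).

```python
# Sketch of the QLoRA + LoRA (PEFT) setup described above.
# All hyperparameters are illustrative assumptions, not the values actually used.
LORA_CONFIG = {
    "r": 16,                # LoRA rank (assumed)
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": [     # attention + MLP projections, typical for Llama
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

def build_model():
    """Requires a CUDA GPU plus the unsloth package; shown for structure only."""
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="meta-llama/Llama-3.2-3B",  # base, non-instruct
        max_seq_length=2048,                    # assumed context length
        load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    )
    # Only the small LoRA adapter matrices are trained; base weights stay frozen.
    model = FastLanguageModel.get_peft_model(model, **LORA_CONFIG)
    return model, tokenizer
```

The resulting model would then be passed to TRL's `SFTTrainer` for supervised fine-tuning on the formatted dataset.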

The model was trained using an instruction–response prompt template (Alpaca-style), enabling stable instruction-following behavior in Korean. The fine-tuning process focused on maintaining the base model’s general language capability while adapting response style, tone, and instruction compliance.


Dataset

  • Primary Dataset: korean_safe_conversation
  • Language: Korean
  • Data Type: Instruction–response conversational data
  • Data Scale: ~27K samples

The dataset was preprocessed to ensure:

  • Clear separation between instruction and response
  • Explicit end-of-sequence (EOS) control to prevent uncontrolled generation
  • Consistent prompt formatting for stable training behavior
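The three preprocessing steps above can be sketched as a single formatting pass. The section-header wording and the `eos_token` value are assumptions; the actual Alpaca-style template may differ in detail.

```python
def preprocess(records, eos_token):
    """Format raw instruction/response pairs into Alpaca-style training text.

    Ensures (1) clear instruction/response separation via section headers,
    (2) an explicit EOS token so generation terminates cleanly, and
    (3) one consistent prompt layout across the whole dataset.
    """
    formatted = []
    for r in records:
        instruction = r["instruction"].strip()
        response = r["response"].strip()
        if not instruction or not response:
            continue  # drop incomplete pairs
        text = (
            f"### Instruction:\n{instruction}\n\n"
            f"### Response:\n{response}{eos_token}"
        )
        formatted.append({"text": text})
    return formatted
```

In practice this function would be applied with `datasets.Dataset.map` (or equivalent) before training, using the tokenizer's actual EOS token.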

Intended Use

This model is intended for:

  • Korean instruction-following assistants
  • Domain-adapted SLM experimentation
  • On-premise inference scenarios where:
    • Data privacy is critical
    • GPU resources are limited
    • Low-latency local inference is preferred

Typical application examples include:

  • Internal enterprise assistants
  • Document-based Q&A systems (with or without retrieval-augmented generation)
  • Operational report generation from structured or semi-structured text

Deployment

  • Format: GGUF
  • Quantization: Q8
  • Deployment Target: CPU or low-VRAM environments
  • Distribution: Hugging Face Hub

The GGUF format allows the model to be deployed without external API dependencies, making it suitable for secure, offline, or air-gapped environments.
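A minimal local-inference sketch using llama-cpp-python is shown below. The GGUF filename and the prompt wording are assumptions; at inference time the prompt should match the template used during fine-tuning, with the response section left empty for the model to complete.

```python
def build_prompt(instruction: str) -> str:
    # Alpaca-style prompt with an empty response section for the model to fill.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

def generate(model_path: str, instruction: str) -> str:
    """Requires llama-cpp-python and a downloaded GGUF file (path is illustrative)."""
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path=model_path, n_ctx=2048)  # runs on CPU by default
    out = llm(
        build_prompt(instruction),
        max_tokens=256,
        stop=["###"],  # stop before a new instruction header
    )
    return out["choices"][0]["text"]
```

Because inference runs entirely in-process, no network access or external API is required, matching the offline and air-gapped use cases above.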


Limitations

  • This model is not an official Meta Instruct model
  • Preference optimization methods such as DPO or RLHF were not applied
  • The model was trained for behavior adaptation and stability, not for benchmark optimization
  • Performance may vary outside the instruction-following and conversational domains

Technical Motivation

This project demonstrates that domain-adapted instruction-following models can be efficiently built and deployed using small-scale resources, providing a practical alternative to large, cost-intensive LLM deployments in real-world systems.
