Upload sft-v2/README.md with huggingface_hub

sft-v2/README.md ADDED

+---
+language:
+  - ko
+  - en
+license: apache-2.0
+tags:
+  - sft
+  - instruction-tuned
+  - chat
+  - korean
+  - llm
+pipeline_tag: text-generation
+---
+# EVAFRILL-Mo 3B — SFT v2
+Instruction-tuned variant of EVAFRILL-Mo 3B, fine-tuned on Korean/English instruction data
+with NEFTune (noise added to the input embeddings during training).
+## Training Stage
+Supervised Fine-Tuning (SFT) on top of the pretrained base checkpoint.
+## Key Details
+- **Steps**: 65,000 (early stopped)
+- **Stop criterion**: Validation loss plateau at 1.79
+- **NEFTune alpha**: 5.0
+- **Gradient Checkpointing**: enabled
+- **Precision**: BF16
+## Metrics
+| Metric | Value |
+|--------|-------|
+| Validation loss (final) | 1.79 |
+## Chat Template
+```
+<|user|>
+{user message}
+<|assistant|>
+{assistant response}
+```
+## Notes
+This is the primary instruction-following checkpoint. It serves as the base for DPO rounds
+and the SLERP merge. For output with less repetition, consider using the
+[SLERP variant](../slerp/) instead.
+## Main Model Card
+See the [main README](../../README.md) for full project details, architecture, and training history.
+## Usage
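+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained("path/to/sft-v2", torch_dtype="bfloat16")
+tokenizer = AutoTokenizer.from_pretrained("path/to/sft-v2")
+# The prompt follows the chat template; the Korean placeholder means "Enter your question here".
+inputs = tokenizer("<|user|>\n질문을 여기에 입력하세요\n<|assistant|>\n", return_tensors="pt")
+# max_new_tokens=256 is an illustrative value, not a recommended setting.
+outputs = model.generate(**inputs, max_new_tokens=256)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```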
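
The chat template in the README above can be built programmatically rather than by hand-writing the prompt string. The following is a minimal sketch, not the project's own code: `build_prompt` is a hypothetical helper, and joining multiple turns back to back with no end-of-turn token is an assumption, since the README only shows a single user/assistant exchange.

```python
def build_prompt(turns, add_generation_prompt=True):
    """Format (role, text) pairs with the <|user|> / <|assistant|> markers.

    Hypothetical helper: multi-turn concatenation is an assumption,
    the README only documents a single exchange.
    """
    valid_roles = {"user", "assistant"}
    parts = []
    for role, text in turns:
        if role not in valid_roles:
            raise ValueError(f"unknown role: {role!r}")
        # Each turn is the role marker on its own line, then the message.
        parts.append(f"<|{role}|>\n{text}\n")
    if add_generation_prompt:
        # Leave the assistant marker open so the model writes the reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)


prompt = build_prompt([("user", "Hello")])
```

For a single user turn this produces the same string shape as the prompt passed to the tokenizer in the Usage section, so `prompt` could be handed to `tokenizer(prompt, return_tensors="pt")` directly.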