pathcosmos committed (verified)
Commit b229e78 · 1 parent: 9d4531b

Upload sft-v2/README.md with huggingface_hub

Files changed (1): sft-v2/README.md (+64, -0)
---
language:
- ko
- en
license: apache-2.0
tags:
- sft
- instruction-tuned
- chat
- korean
- llm
pipeline_tag: text-generation
---

# EVAFRILL-Mo 3B — SFT v2

Instruction-tuned variant of EVAFRILL-Mo 3B. Fine-tuned on Korean/English instruction data with NEFTune noise augmentation.

## Training Stage

Supervised Fine-Tuning (SFT) on top of the pretrained base checkpoint.

## Key Details

- **Steps**: 65,000 (early stopped)
- **Stop criterion**: validation loss plateau at 1.79
- **NEFTune alpha**: 5.0
- **Gradient checkpointing**: enabled
- **Precision**: BF16
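
NEFTune (alpha = 5.0 above) regularizes SFT by adding uniform noise to the input embeddings during training, scaled by alpha / sqrt(seq_len × hidden_dim). A minimal sketch of the standard formulation, for illustration only; this is not the project's actual training code:

```python
import torch

def neftune_noise(embeddings: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """Add NEFTune-style uniform noise to input embeddings.

    embeddings: (batch, seq_len, hidden_dim) tensor of input embeddings.
    Noise is drawn from U(-scale, scale) with scale = alpha / sqrt(seq_len * hidden_dim).
    """
    _, seq_len, hidden_dim = embeddings.shape
    scale = alpha / (seq_len * hidden_dim) ** 0.5
    noise = torch.empty_like(embeddings).uniform_(-scale, scale)
    return embeddings + noise
```

In practice this is applied only during training (not at inference), e.g. via the `neftune_noise_alpha` option in the `transformers` Trainer.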

## Metrics

| Metric | Value |
|--------|-------|
| Validation loss (final) | 1.79 |

## Chat Template

```
<|user|>
{user message}
<|assistant|>
{assistant response}
```
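
The template above can be assembled with a small helper. `format_chat` is an illustrative name, not a function shipped with this checkpoint or its tokenizer:

```python
def format_chat(turns: list[tuple[str, str]]) -> str:
    """Render (user, assistant) turns into the model's chat format.

    Leave the final assistant response empty ("") to produce a
    generation prompt that ends with the <|assistant|> tag.
    """
    parts = []
    for user_msg, assistant_msg in turns:
        parts.append(f"<|user|>\n{user_msg}\n<|assistant|>\n{assistant_msg}")
    return "\n".join(parts)

# Prompt for a new response ("안녕하세요?" = "Hello?"):
prompt = format_chat([("안녕하세요?", "")])
```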

## Notes

This is the primary instruction-following checkpoint. It serves as the base for the DPO rounds and the SLERP merge. For best results with reduced repetition, consider using the [SLERP variant](../slerp/) instead.

## Main Model Card

See the [main README](../../README.md) for full project details, architecture, and training history.

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("path/to/sft-v2", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("path/to/sft-v2")

# Korean placeholder prompt: "Enter your question here"
inputs = tokenizer("<|user|>\n질문을 여기에 입력하세요\n<|assistant|>\n", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```