tags:
  - Korean
  - Culture
---

# Midm-KCulture-2.0-Base-Instruct

- This model is fine-tuned from KT/Midm-2.0-Base-Instruct on the 'Korean Culture Q&A Corpus' using LoRA (Low-Rank Adaptation).

## Training Hyperparameters

| Hyperparameter                | Value                  |
| :---------------------------- | :--------------------- |
| **SFTConfig**                 |                        |
| `torch_dtype`                 | `bfloat16`             |
| `seed`                        | `42`                   |
| `epoch`                       | `3`                    |
| `per_device_train_batch_size` | `2`                    |
| `per_device_eval_batch_size`  | `2`                    |
| `learning_rate`               | `0.0002`               |
| `lr_scheduler_type`           | `"linear"`             |
| `max_grad_norm`               | `1.0`                  |
| `neftune_noise_alpha`         | `None`                 |
| `gradient_accumulation_steps` | `1`                    |
| `gradient_checkpointing`      | `False`                |
| `max_seq_length`              | `1024`                 |
| **LoraConfig**                |                        |
| `r`                           | `16`                   |
| `lora_alpha`                  | `16`                   |
| `lora_dropout`                | `0.1`                  |
| `target_modules`              | `["q_proj", "v_proj"]` |

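The table above maps onto `peft` and `trl` configuration objects roughly as follows. This is a sketch, not the training script: the `output_dir` is a placeholder, and argument names (e.g. `max_seq_length`) may differ across `trl` versions.

```python
# Configuration sketch matching the hyperparameter table above.
# output_dir is a placeholder; check argument names against your
# installed peft/trl versions before running.
from peft import LoraConfig
from trl import SFTConfig

peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

sft_config = SFTConfig(
    output_dir="midm-kculture-lora",   # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    max_grad_norm=1.0,
    gradient_accumulation_steps=1,
    gradient_checkpointing=False,
    max_seq_length=1024,
    seed=42,
    bf16=True,                         # corresponds to the bfloat16 torch_dtype row
)
```

These objects would then be passed to `trl`'s `SFTTrainer` together with the base model and the Q&A corpus.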
## Usage
```python
import torch  # required for the torch.bfloat16 dtype below
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jjae/Midm-KCulture-2.0-Base-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```