ugaoo committed (verified) · Commit da3803b · Parent: 821ba38

Update README.md

Files changed (1): README.md (+0 −145)
---
library_name: peft
base_model: ugaoo/llama_85_8
tags:
- generated_from_trainer
datasets:
- ugaoo/medmcqa_trail_doc_anki
model-index:
- name: out/llama_85_8_medmcqa_trail_doc_anki
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.8.0.dev0`
```yaml
base_model: ugaoo/llama_85_8
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
trust_remote_code: true

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: ugaoo/medmcqa_trail_doc_anki
    type: alpaca
val_set_size: 0
output_dir: ./out/llama_85_8_medmcqa_trail_doc_anki

sequence_len: 4000
sample_packing: true
pad_to_sequence_len: true

adapter: qlora
lora_r: 256
lora_alpha: 512
lora_dropout: 0.05
lora_target_linear: true
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - up_proj
  - down_proj
  - gate_proj
lora_modules_to_save:
  - embed_tokens
  - lm_head

wandb_project: peftsearchllama
wandb_entity:
wandb_watch:
wandb_name: llama_85_8_medmcqa_trail_doc_anki
wandb_log_model:

gradient_accumulation_steps: 3
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 5e-6

train_on_inputs: false
group_by_length: false
bf16: auto
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
evals_per_epoch: 6
eval_table_size:
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
save_total_limit: 6
special_tokens:
  pad_token: <|end_of_text|>

```

</details><br>

# out/llama_85_8_medmcqa_trail_doc_anki

This model is a fine-tuned version of [ugaoo/llama_85_8](https://huggingface.co/ugaoo/llama_85_8) on the ugaoo/medmcqa_trail_doc_anki dataset.
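
Not part of the generated card: a toy sketch of the low-rank update the QLoRA settings above describe. With `lora_r: 256` and `lora_alpha: 512`, the adapter product is scaled by `alpha / r = 2.0` before being added to each frozen target projection (`q_proj`, `k_proj`, ...); the 2×2 matrices and rank-2 factors below are illustrative stand-ins, not the real weights.

```python
# Toy sketch of the LoRA update rule (illustrative, not the card's code):
# W_eff = W + (lora_alpha / lora_r) * B @ A, applied per target projection.
lora_r, lora_alpha = 256, 512
scaling = lora_alpha / lora_r      # = 2.0 with this config

def matmul(X, Y):
    """Plain-Python matrix product for the tiny stand-in matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# 2x2 stand-ins: frozen base weight W, adapter factors B (out x r') and
# A (r' x in), with a toy rank r' = 2 instead of the real 256.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.5, 0.0], [0.0, 0.5]]
A = [[0.1, 0.0], [0.0, 0.1]]

delta = matmul(B, A)               # low-rank update, out x in
W_eff = [[W[i][j] + scaling * delta[i][j] for j in range(2)]
         for i in range(2)]
```

Only the B/A factors (plus `embed_tokens` and `lm_head`, per `lora_modules_to_save`) are trained; the 4-bit base weights stay frozen.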

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 3
- total_train_batch_size: 24
- total_eval_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3.0
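
A quick illustrative check (not part of the generated card) of how the derived totals above follow from the config, plus the shape of the cosine-with-warmup schedule; `total_steps` below is a made-up stand-in, since the real step count depends on the packed dataset length.

```python
import math

# Derived batch totals (illustrative check, not the Trainer's code).
micro_batch_size = 2              # per-device train batch size
gradient_accumulation_steps = 3
num_devices = 4
eval_batch_size = 2               # per-device eval batch size

# Effective train batch = per-device * accumulation * devices = 2*3*4.
total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
# Eval does no gradient accumulation, so only devices multiply.
total_eval_batch_size = eval_batch_size * num_devices

# Cosine schedule with linear warmup, in the style of transformers'
# get_cosine_schedule_with_warmup. total_steps = 1000 is a stand-in.
base_lr, warmup_steps, total_steps = 5e-6, 100, 1000

def lr_at(step):
    if step < warmup_steps:
        return base_lr * step / warmup_steps                      # linear warmup
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))   # cosine decay
```

The learning rate climbs linearly to 5e-6 over the first 100 steps, then decays along a half cosine toward zero at the final step.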

### Training results



### Framework versions

- PEFT 0.15.0
- Transformers 4.49.0
- Pytorch 2.5.1+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1