Delta-Vector committed on
Commit 722142d · verified · 1 Parent(s): 7697769

Update README.md

Files changed (1):
  1. README.md +107 -101
README.md CHANGED
@@ -1,34 +1,111 @@
- ---
- library_name: transformers
- license: agpl-3.0
- base_model: Delta-Vector/Holland-4B-V1
- tags:
- - generated_from_trainer
- datasets:
- - NewEden/CivitAI-Prompts-Sharegpt
- model-index:
- - name: outputs/out2
- results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.6.0`
- ```yaml
  base_model: Delta-Vector/Holland-4B-V1
  model_type: AutoModelForCausalLM
  tokenizer_type: AutoTokenizer
-
  trust_remote_code: true
-
  load_in_8bit: false
  load_in_4bit: false
  strict: false
-
  datasets:
  - path: NewEden/CivitAI-SD-Prompts
  datasets:
@@ -40,7 +117,6 @@ datasets:
  message_field_role: from
  message_field_content: value
  train_on_eos: turn
-
  dataset_prepared_path:
  val_set_size: 0.02
  output_dir: ./outputs/out2
@@ -48,33 +124,28 @@ sequence_len: 8192
  sample_packing: true
  eval_sample_packing: false
  pad_to_sequence_len: true
-
  plugins:
  - axolotl.integrations.liger.LigerPlugin
  liger_rope: true
  liger_rms_norm: true
  liger_swiglu: true
  liger_fused_linear_cross_entropy: true
-
  wandb_project: SDprompter-final
  wandb_entity:
  wandb_watch:
  wandb_name: SDprompter-final
  wandb_log_model:
-
  gradient_accumulation_steps: 16
  micro_batch_size: 1
  num_epochs: 4
  optimizer: paged_adamw_8bit
  lr_scheduler: cosine
  learning_rate: 0.00001
-
  train_on_inputs: false
  group_by_length: false
  bf16: auto
  fp16:
  tf32: true
-
  gradient_checkpointing: true
  gradient_checkpointing_kwargs:
  use_reentrant: false
@@ -84,83 +155,18 @@ local_rank:
  logging_steps: 1
  xformers_attention:
  flash_attention: true
-
  warmup_ratio: 0.05
  evals_per_epoch: 4
  saves_per_epoch: 1
  debug:
  weight_decay: 0.01
-
  special_tokens:
  pad_token: <|finetune_right_pad_id|>
  eos_token: <|eot_id|>
-
  auto_resume_from_checkpoints: true
-
- ```
-
- </details><br>
-
- # outputs/out2
-
- This model is a fine-tuned version of [Delta-Vector/Holland-4B-V1](https://huggingface.co/Delta-Vector/Holland-4B-V1) on the NewEden/CivitAI-Prompts-Sharegpt dataset.
- It achieves the following results on the evaluation set:
- - Loss: 3.2782
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 16
- - optimizer: Use paged_adamw_8bit with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 4
- - num_epochs: 4
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 3.3357 | 0.0416 | 1 | 4.2492 |
- | 2.9892 | 0.2494 | 6 | 3.6285 |
- | 2.7364 | 0.4987 | 12 | 3.4675 |
- | 2.7076 | 0.7481 | 18 | 3.3928 |
- | 2.757 | 0.9974 | 24 | 3.3484 |
- | 2.5801 | 1.2078 | 30 | 3.3286 |
- | 2.6156 | 1.4571 | 36 | 3.3111 |
- | 2.5308 | 1.7065 | 42 | 3.2999 |
- | 2.5481 | 1.9558 | 48 | 3.2880 |
- | 2.5773 | 2.1662 | 54 | 3.2840 |
- | 2.5269 | 2.4156 | 60 | 3.2822 |
- | 2.5418 | 2.6649 | 66 | 3.2806 |
- | 2.4584 | 2.9143 | 72 | 3.2791 |
- | 2.6515 | 3.1247 | 78 | 3.2789 |
- | 2.4883 | 3.3740 | 84 | 3.2785 |
- | 2.4193 | 3.6234 | 90 | 3.2787 |
- | 2.4337 | 3.8727 | 96 | 3.2782 |
-
-
- ### Framework versions
-
- - Transformers 4.47.1
- - Pytorch 2.5.1+cu124
- - Datasets 3.2.0
- - Tokenizers 0.21.0
 
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>Model README</title>
+ <style>
+ body {
+ background: linear-gradient(-45deg, #0a0a0a, #121212, #1a1a1a);
+ color: #E0E0E0;
+ font-family: 'Segoe UI', system-ui;
+ margin: 0;
+ padding: 20px;
+ min-height: 100vh;
+ animation: gradient 15s ease infinite;
+ background-size: 400% 400%;
+ text-align: center;
+ }
+ @keyframes gradient {
+ 0% { background-position: 0% 50%; }
+ 50% { background-position: 100% 50%; }
+ 100% { background-position: 0% 50%; }
+ }
+ .container {
+ max-width: 800px;
+ margin: auto;
+ }
+ .model-image {
+ width: 100%;
+ border-radius: 12px;
+ filter: drop-shadow(0 0 10px rgba(255, 255, 255, 0.1));
+ animation: float 6s ease-in-out infinite;
+ }
+ @keyframes float {
+ 0%, 100% { transform: translateY(0); }
+ 50% { transform: translateY(-20px); }
+ }
+ .box {
+ background: rgba(30, 30, 30, 0.9);
+ border-radius: 12px;
+ padding: 20px;
+ margin: 25px 0;
+ backdrop-filter: blur(10px);
+ border: 1px solid rgba(255, 255, 255, 0.1);
+ text-align: left;
+ }
+ h2 {
+ border-left: 4px solid #0ff;
+ padding-left: 15px;
+ margin: 0 0 15px 0;
+ background: linear-gradient(90deg, transparent, rgba(0, 255, 255, 0.1));
+ text-transform: uppercase;
+ letter-spacing: 2px;
+ color: #fff;
+ }
+ .yaml-content {
+ background: #191919;
+ border-radius: 8px;
+ padding: 10px;
+ margin-top: 10px;
+ font-family: monospace;
+ white-space: pre-wrap;
+ color: #E0E0E0;
+ border-left: 4px solid #0ff;
+ }
+ /* Custom Scrollbar */
+ ::-webkit-scrollbar { width: 8px; }
+ ::-webkit-scrollbar-track { background: #121212; }
+ ::-webkit-scrollbar-thumb {
+ background: #333;
+ border-radius: 4px;
+ }
+ </style>
+ </head>
+ <body>
+ <div class="container">
+ <img src="your-image-url" class="model-image" alt="Model Visualization">
+ <div class="box">
+ <h2>🔍 Overview</h2>
+ <p>This is the second in a line of models dedicated to creating Stable Diffusion prompts from a given character appearance. Made for the CharGen project, it has been fine-tuned on top of Delta-Vector/Holland-4B-V1.</p>
+ </div>
+ <div class="box">
+ <h2>⚖️ Quants</h2>
+ <p>Available quantization formats:</p>
+ <ul>
+ <li>GGUF: https://huggingface.co/mradermacher/SDPrompter4b-GGUF</li>
+ <li>EXL2: https://huggingface.co/</li>
+ </ul>
+ </div>
+ <div class="box">
+ <h2>💬 Prompting</h2>
+ <p><strong>Recommended format: ChatML.</strong> Use the system prompt below. Avoid setting a high output-token limit, as the model tends to loop; a min-p of 0.1 and a temperature of 1 keep it coherent.</p>
+ <code>Create a prompt for Stable Diffusion based on the information below.</code>
+ </div>
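The ChatML layout recommended above can be sketched as follows. This is a minimal illustration, not code from the model card; the system prompt is the one given above, while the character description and the `build_chatml_prompt` helper are hypothetical examples.

```python
# Minimal sketch of a single-turn ChatML prompt using the
# system prompt recommended above. The character description
# is a made-up example.
SYSTEM = "Create a prompt for Stable Diffusion based on the information below."

def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    SYSTEM, "A silver-haired elf ranger in a mossy forest, wearing a green cloak."
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the assistant turn open so the model completes it with the generated Stable Diffusion prompt.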
+ <div class="box">
+ <h2>🌟 Credits</h2>
+ <p>Fine-tuned on 1x RTX 6000 provided by Kubernetes_bad. All credit goes to Kubernetes_bad, LucyKnada, and the rest of Anthracite.</p>
+ </div>
+ <div class="box">
+ <h2>🛠️ Axolotl Config</h2>
+ <pre>
  base_model: Delta-Vector/Holland-4B-V1
  model_type: AutoModelForCausalLM
  tokenizer_type: AutoTokenizer

  trust_remote_code: true

  load_in_8bit: false
  load_in_4bit: false
  strict: false

  datasets:
  - path: NewEden/CivitAI-SD-Prompts
  datasets:

  message_field_role: from
  message_field_content: value
  train_on_eos: turn

  dataset_prepared_path:
  val_set_size: 0.02
  output_dir: ./outputs/out2

  sample_packing: true
  eval_sample_packing: false
  pad_to_sequence_len: true

  plugins:
  - axolotl.integrations.liger.LigerPlugin
  liger_rope: true
  liger_rms_norm: true
  liger_swiglu: true
  liger_fused_linear_cross_entropy: true

  wandb_project: SDprompter-final
  wandb_entity:
  wandb_watch:
  wandb_name: SDprompter-final
  wandb_log_model:

  gradient_accumulation_steps: 16
  micro_batch_size: 1
  num_epochs: 4
  optimizer: paged_adamw_8bit
  lr_scheduler: cosine
  learning_rate: 0.00001

  train_on_inputs: false
  group_by_length: false
  bf16: auto
  fp16:
  tf32: true

  gradient_checkpointing: true
  gradient_checkpointing_kwargs:
  use_reentrant: false

  logging_steps: 1
  xformers_attention:
  flash_attention: true

  warmup_ratio: 0.05
  evals_per_epoch: 4
  saves_per_epoch: 1
  debug:
  weight_decay: 0.01

  special_tokens:
  pad_token: <|finetune_right_pad_id|>
  eos_token: <|eot_id|>

  auto_resume_from_checkpoints: true
+ </pre>
+ </div>
+ </div>
+ </body>
+ </html>
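As a sanity check, the effective train batch size implied by the config above can be derived by hand; the single-GPU count is an assumption based on the credits section (1x RTX 6000).

```python
# Effective train batch size implied by the axolotl config above.
# num_gpus = 1 is an assumption (single RTX 6000, per the credits).
gradient_accumulation_steps = 16
micro_batch_size = 1
num_gpus = 1

total_train_batch_size = gradient_accumulation_steps * micro_batch_size * num_gpus
print(total_train_batch_size)  # 16, matching the old card's total_train_batch_size
```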