pszemraj, SFconvertbot, and Peter committed
Commit b5e18bf · verified · 0 parent(s)

Super-squash branch 'main' using huggingface_hub

Co-authored-by: SFconvertbot <SFconvertbot@users.noreply.huggingface.co>
Co-authored-by: peter <peter@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
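Each pattern above tells Git to store matching files through Git LFS: a small pointer file stays in the repository, while the payload lives in LFS storage. As a rough illustration of how such globs classify filenames — a simplification, since real gitattributes matching has extra rules (path anchoring, the `saved_model/**/*` form) that Python's `fnmatch` does not reproduce:

```python
from fnmatch import fnmatch

# A few of the simple glob patterns from the .gitattributes above.
# Simplification: gitattributes matching has more rules than fnmatch.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.onnx", "*.h5", "*.pt", "*.zip", "*tfevents*"]

def tracked_by_lfs(filename: str) -> bool:
    """Return True if the filename matches one of the simple LFS globs."""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)

print(tracked_by_lfs("pytorch_model.bin"))  # True
print(tracked_by_lfs("README.md"))          # False
```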
README.md ADDED
@@ -0,0 +1,168 @@
+ ---
+ license: apache-2.0
+ tags:
+ - instruct
+ - instructions
+ - domain adapt
+ - instructiongen
+ metrics:
+ - rouge
+ widget:
+ - text: >-
+     You'll need to start by choosing the right venue. Consider the type of
+     atmosphere and the size of the area that will be suitable for the number of
+     guests you plan to invite. Choose the right decorations based on your
+     brother's interests, such as balloons in his favorite colors, banners, and
+     streamers. Next, decide on the food and drinks, making sure they are tasty
+     and appropriate for the occasion. Then decide on the other games, music, and
+     entertainment that will make the party memorable. Finally, involve your
+     brother's friends and family to help create the perfect surprise.
+   example_title: birthday party
+ - text: 1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo
+   example_title: ice cream
+ - text: >-
+     Start by selecting a scale model of a building that fits the theme. Use a
+     hobby knife and glue to cut and assemble the model into a ruined or
+     abandoned version of itself, adding details like broken windows and
+     graffiti. Create a base for the diorama using foam, plaster, or other
+     materials, and paint it to resemble a ruined street or sidewalk. Add
+     miniature vehicles, debris, and figures to complete the scene, and use
+     weathering techniques like dry brushing and rust washes to add realism.
+     Display the diorama in a shadow box or other protective case to showcase
+     your work.
+   example_title: Miniature diorama creation
+ - text: >-
+     Start by selecting clothing that is futuristic and edgy, such as leather
+     jackets, neon-colored accessories, and tech-inspired patterns. Add
+     accessories like goggles, cybernetic implants, and LED lights to enhance the
+     cyberpunk vibe. Use makeup and body paint to create a futuristic look, such
+     as metallic skin or neon makeup. Consider adding functional elements to your
+     costume, such as a built-in backpack or hidden pockets for your tech
+     gadgets. Finally, practice your confident walk and embrace your inner
+     cyberpunk for a memorable and immersive costume experience.
+   example_title: Cyberpunk costume design
+ - text: >-
+     Start by creating a base terrain with mountains, valleys, and other natural
+     features. Use fractal noise and displacement mapping to add texture and
+     detail to the terrain, and experiment with different materials like rock,
+     grass, and water. Add surreal elements like floating islands, giant
+     mushrooms, or impossible geometry to create a dreamlike atmosphere. Use
+     lighting and color grading to enhance the mood and tone of the scene, and
+     render the final image at a high resolution for maximum impact. Share your
+     surreal landscape with the world and inspire others to explore the
+     possibilities of 3D art.
+   example_title: Surreal 3D landscape creation
+ - text: >-
+     Start by setting a realistic goal and creating a training plan. Build up
+     your mileage gradually over time, and incorporate cross-training and
+     strength exercises to prevent injury and improve endurance. Be sure to stay
+     hydrated and properly fuel your body with nutritious foods. Listen to your
+     body and adjust your training as needed to avoid overexertion or burnout.
+     Finally, taper your training in the weeks leading up to the race to give
+     your body time to rest and recover before the big day.
+   example_title: Marathon training
+ - text: >-
+     What the hell did you just say about me, you little bug? I graduated top of
+     my class in https://huggingface.co/spaces/safetensors/convert, and I've been
+     involved in numerous secret tasks on PyTorch, and I have over 300 confirmed
+     PRs. I am trained in code optimization and I'm the top converter in the
+     entire Hugging Face forces. You are nothing to me but just another target. I
+     will convert your code with precision the likes of which has never been seen
+     before on this Earth, mark my freaking words.
+
+     You think you can get away with saying your code is safe over the Internet?
+     Think again, bug. As we speak I am contacting my secret network of data
+     scientists across the GitHub and your IP is being traced right now so you
+     better prepare for the storm, maggot. The storm that wipes out the pathetic
+     little thing you call your code. You’re freaking doomed, kid. I can be
+     anywhere, anytime, and I can convert your code in over seven hundred ways,
+     and that’s just with my bare hands.
+
+     Not only am I extensively trained in unarmed conversion, but I have access
+     to the entire arsenal of the Hugging Face and I will use it to its full
+     extent to wipe your miserable code off the face of the continent, you little
+     bug. If only you could have known what unholy retribution your little
+     "clever" comment was about to bring down upon you, maybe you would have held
+     your freaking tongue.
+
+     But you couldn’t, you didn’t, and now you’re paying the price, you goddamn
+     idiot. I will convert fury all over you and you will drown in it. Your
+     model's doomed, kiddo.
+
+     Oh, and by the way, these converted files load much faster than your PyTorch
+     counterparts. You can check the speed here:
+     https://colab.research.google.com/github/huggingface/notebooks/blob/main/safetensors_doc/en/speed.ipynb
+
+     Your widgets will run using this converted model, even if you do not merge.
+     But, if you find any issues, feel free to report here:
+     https://huggingface.co/spaces/safetensors/convert/discussions
+
+     Feel free to ignore this PR. But remember, I'm watching you.
+   example_title: Navy Safetensors PR
+ inference:
+   parameters:
+     max_length: 96
+     num_beams: 4
+     early_stopping: true
+ datasets:
+ - pszemraj/fleece2instructions-inputs-alpaca-cleaned
+ language:
+ - en
+ pipeline_tag: text2text-generation
+ library_name: transformers
+ ---
+
+
+ # bart-large-instructiongen-w-inputs
+
+ Use this text2text model to find out what LLM `instruction` (**and** `inputs`, if relevant) might have generated `<arbitrary input text>`!
+
+ This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on the `pszemraj/fleece2instructions-inputs-alpaca-cleaned` dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.9302
+ - Rouge1: 64.2236
+ - Rouge2: 41.5632
+ - Rougel: 60.5935
+ - Rougelsum: 62.1285
+ - Gen Len: 25.8938
+
+ ## Example
+
+ ![api](https://i.imgur.com/2xubG7N.png)
+
+ ## Intended uses & limitations
+
+ This model is intended to generate instructions from arbitrary text; you can then use those instructions together with your own data to fine-tune an LLM on instructions for a specific domain. It is primarily meant to enable **low-resource domain adaptation**, rather than "_I want to generate even better prompts for the FLAN-V2 dataset!_".
+
+ The `fleece2instructions-inputs-alpaca-cleaned` dataset, obtained from the [alpaca-lora repo](https://github.com/tloen/alpaca-lora) under the ODC-BY license, has been converted to a text2text format for use with language models. In this dataset, the original 'inputs' and 'instructions' columns are combined into a single 'instructions_inputs' column. To clearly separate the two types of content, each piece of text is prefixed with either an `<instruction>` or `<inputs>` token. These tokens not only facilitate model comprehension but also allow for easy regex separation of model outputs during inference.
+
+ As such, users can expect the output of this model to be similarly structured with `<instruction>` and `<inputs>` tokens.
+
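The `<instruction>`/`<inputs>` token scheme described above lends itself to simple post-processing. Below is a minimal sketch: the repo id is inferred from the card title, and the regex plus the `split_instruction_inputs` and `generate_instruction` helpers are illustrative assumptions, not part of the card.

```python
import re

# from transformers import pipeline  # uncomment to run actual inference

def split_instruction_inputs(generated: str) -> dict:
    """Split output shaped like '<instruction> ... <inputs> ...' into parts.

    The regex is an assumption based on the token scheme described above.
    """
    m = re.match(r"\s*<instruction>\s*(.*?)\s*(?:<inputs>\s*(.*))?$", generated, re.S)
    if not m:
        return {"instruction": generated.strip(), "inputs": None}
    inputs = m.group(2)
    return {
        "instruction": m.group(1).strip(),
        "inputs": inputs.strip() if inputs else None,
    }

def generate_instruction(text: str) -> dict:
    """Hypothetical end-to-end helper; repo id inferred from the card title."""
    gen = pipeline(
        "text2text-generation",
        model="pszemraj/bart-large-instructiongen-w-inputs",
    )
    out = gen(text, max_length=96, num_beams=4, early_stopping=True)
    return split_instruction_inputs(out[0]["generated_text"])

print(split_instruction_inputs("<instruction> Plan a surprise party. <inputs> budget: $200"))
```

`generate_instruction` would download the model, so it is left un-invoked here; the splitting logic alone recovers both fields from a generated string, and the second field comes back as `None` when the model emits no `<inputs>` segment.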
+ ## Training and evaluation data
+
+ Refer to the [fleece2instructions-inputs-alpaca-cleaned](https://huggingface.co/datasets/pszemraj/fleece2instructions-inputs-alpaca-cleaned) dataset.
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 6e-05
+ - train_batch_size: 16
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.03
+ - num_epochs: 3.0
+
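The total_train_batch_size above is not an independent setting: it is the per-device batch size times the gradient-accumulation steps (times the number of processes, which must be 1 for the reported numbers to line up). A quick arithmetic check:

```python
# Effective (total) train batch size implied by the hyperparameters above.
per_device_batch = 16   # train_batch_size
grad_accum_steps = 2    # gradient_accumulation_steps
world_size = 1          # assumed: the reported total only lines up with one process

total_train_batch_size = per_device_batch * grad_accum_steps * world_size
assert total_train_batch_size == 32  # matches the value reported above

# The results table reports 1361 optimizer steps per epoch, implying roughly
# 1361 * 32 training examples seen per epoch.
examples_per_epoch = 1361 * total_train_batch_size
print(examples_per_epoch)  # 43552
```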
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
+ |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
+ | 1.0145 | 1.0 | 1361 | 1.0460 | 62.8374 | 39.8538 | 59.2593 | 60.8095 | 25.2752 |
+ | 0.8796 | 2.0 | 2722 | 0.9289 | 63.7086 | 41.1315 | 60.1588 | 61.7145 | 25.7215 |
+ | 0.6943 | 3.0 | 4083 | 0.9302 | 64.2236 | 41.5632 | 60.5935 | 62.1285 | 25.8938 |
config.json ADDED
@@ -0,0 +1,74 @@
+ {
+   "_name_or_path": "pszemraj/bart-large-fleece2instructions-inputs-alpaca-cleaned-r1",
+   "activation_dropout": 0.1,
+   "activation_function": "gelu",
+   "add_bias_logits": false,
+   "add_final_layer_norm": false,
+   "architectures": [
+     "BartForConditionalGeneration"
+   ],
+   "attention_dropout": 0.1,
+   "bos_token_id": 0,
+   "classif_dropout": 0.1,
+   "classifier_dropout": 0.0,
+   "d_model": 1024,
+   "decoder_attention_heads": 16,
+   "decoder_ffn_dim": 4096,
+   "decoder_layerdrop": 0.0,
+   "decoder_layers": 12,
+   "decoder_start_token_id": 2,
+   "dropout": 0.1,
+   "early_stopping": true,
+   "encoder_attention_heads": 16,
+   "encoder_ffn_dim": 4096,
+   "encoder_layerdrop": 0.0,
+   "encoder_layers": 12,
+   "eos_token_id": 2,
+   "forced_bos_token_id": 0,
+   "forced_eos_token_id": 2,
+   "gradient_checkpointing": false,
+   "id2label": {
+     "0": "LABEL_0",
+     "1": "LABEL_1",
+     "2": "LABEL_2"
+   },
+   "init_std": 0.02,
+   "is_encoder_decoder": true,
+   "label2id": {
+     "LABEL_0": 0,
+     "LABEL_1": 1,
+     "LABEL_2": 2
+   },
+   "max_position_embeddings": 1024,
+   "model_type": "bart",
+   "no_repeat_ngram_size": 3,
+   "normalize_before": false,
+   "num_beams": 4,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "scale_embedding": false,
+   "task_specific_params": {
+     "summarization": {
+       "length_penalty": 1.0,
+       "max_length": 128,
+       "min_length": 12,
+       "num_beams": 4
+     },
+     "summarization_cnn": {
+       "length_penalty": 2.0,
+       "max_length": 142,
+       "min_length": 56,
+       "num_beams": 4
+     },
+     "summarization_xsum": {
+       "length_penalty": 1.0,
+       "max_length": 62,
+       "min_length": 11,
+       "num_beams": 6
+     }
+   },
+   "torch_dtype": "float32",
+   "transformers_version": "4.27.3",
+   "use_cache": true,
+   "vocab_size": 50265
+ }
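The config above pins down the standard BART-large attention geometry; a quick sanity check of the derived quantities (each head's subspace dimension and the feed-forward expansion factor):

```python
# Sanity-check the attention geometry implied by config.json above.
d_model = 1024
attention_heads = 16   # encoder_attention_heads / decoder_attention_heads
ffn_dim = 4096         # encoder_ffn_dim / decoder_ffn_dim

assert d_model % attention_heads == 0
head_dim = d_model // attention_heads
print(head_dim)            # 64: each head attends in a 64-dimensional subspace
print(ffn_dim // d_model)  # 4: the usual 4x feed-forward expansion
```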
generation_config.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "bos_token_id": 0,
+   "decoder_start_token_id": 2,
+   "early_stopping": true,
+   "eos_token_id": 2,
+   "forced_bos_token_id": 0,
+   "forced_eos_token_id": 2,
+   "no_repeat_ngram_size": 3,
+   "num_beams": 4,
+   "pad_token_id": 1,
+   "transformers_version": "4.27.3"
+ }
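generation_config.json holds the decoding defaults that transformers applies when `model.generate()` is called without explicit arguments. A sketch of what those defaults amount to, with the dict mirroring the file above (`model` and `inputs` in the comment are placeholders, not defined here):

```python
# Decoding defaults mirrored from generation_config.json above.
gen_kwargs = {
    "num_beams": 4,             # beam search with 4 beams
    "early_stopping": True,     # finish each beam at EOS
    "no_repeat_ngram_size": 3,  # forbid repeating any 3-gram
    "decoder_start_token_id": 2,
    "forced_bos_token_id": 0,
    "forced_eos_token_id": 2,
}

# With transformers these are picked up automatically from the file, or can be
# passed explicitly (placeholders, not runnable as-is):
# output_ids = model.generate(**inputs, **gen_kwargs)
print(gen_kwargs["num_beams"])  # 4
```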
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e51d6fa9a06fd175f56da0284dbec2faa29708db3235ed38694f6a48216304d
+ size 1832031611
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8dcb299afd21503dae01ab7b2ce6325076ceb3a5e99277d44e224de0b866a83
+ size 1625426996
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bc01e5c865bd8d6122080d5054a1d3f5db43bce0d29613abbaa250edf52f1bd3
+ size 1625534221
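Each of the three weight files above is stored as a Git LFS pointer rather than the payload itself: a `version` line naming the spec, the SHA-256 `oid` of the content, and its `size` in bytes. A minimal parser sketch for this pointer format:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file (version / oid / size lines)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # oid is prefixed with the hash algorithm, e.g. "sha256:<hex digest>"
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:bc01e5c865bd8d6122080d5054a1d3f5db43bce0d29613abbaa250edf52f1bd3
size 1625534221"""
print(parse_lfs_pointer(pointer)["size_bytes"])  # 1625534221 (~1.6 GB)
```

This is how a checkout without `git lfs` installed ends up with three-line stub files: the pointer is all that git itself versions.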
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "<unk>"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "add_prefix_space": false,
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "mask_token": "<mask>",
+   "model_max_length": 1024,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "special_tokens_map_file": null,
+   "tokenizer_class": "BartTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff