---
language:
- ru
---

![](https://cdn.discordapp.com/attachments/791342238541152306/1264099835221381251/image.png?ex=669ca436&is=669b52b6&hm=129f56187c31e1ed22cbd1bcdbc677a2baeea5090761d2f1a458c8b1ec7cca4b&)

# QuantFactory/T-lite-instruct-0.1-GGUF
This is a quantized version of [AnatoliiPotapov/T-lite-instruct-0.1](https://huggingface.co/AnatoliiPotapov/T-lite-instruct-0.1), created using llama.cpp.
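Because the files in this repo are in GGUF format, they can be run directly with llama.cpp or any GGUF-compatible runtime. A minimal sketch; the quant filename below is an assumption, so check the repository's file list for the exact name:

```shell
# Fetch one quant from this repo (filename is assumed; see the "Files and versions" tab)
huggingface-cli download QuantFactory/T-lite-instruct-0.1-GGUF \
  T-lite-instruct-0.1.Q4_K_M.gguf --local-dir .

# Interactive chat with llama.cpp's CLI, applying the model's built-in chat template
./llama-cli -m T-lite-instruct-0.1.Q4_K_M.gguf -cnv -n 256
```

Smaller quants (e.g. Q4) trade some quality for memory; pick the largest one that fits your hardware.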

# Original Model Card

# T-lite-instruct-0.1

**🚨 T-lite is designed for further fine-tuning and is not intended as a ready-to-use conversational assistant. Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.**

## Description

T-lite-instruct-0.1 is an instruct version of the T-lite-0.1 model.

T-lite-instruct-0.1 was trained in bf16.

### 📚 Dataset

#### Contexts
For the instruction dataset, contexts are obtained from:
- open-source English-language datasets (such as UltraFeedback, HelpSteer, and SHP)
- machine translations of English-language datasets
- synthetic grounded-QA contexts generated from pre-training datasets

The translated contexts are filtered with classifiers.

#### SFT
Responses to the contexts are generated by a strong model, and training is carried out exclusively on these responses. This avoids training the model on poor-quality translations.

#### Reward Modeling
The RM is trained on the following preference pairs:
- Strong Model > Our Model
- Stronger Model > Weaker Model
- Chosen Translated Response > Rejected Translated Response
- pairs from the original English datasets

The translated preference data are pre-filtered by an RM ensemble.
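Pairwise preferences like those above are typically fit with a Bradley-Terry style objective: the RM assigns each response a scalar reward, and the loss pushes the chosen response's reward above the rejected one's. An illustrative stdlib-only sketch (not the authors' training code):

```python
import math

def pairwise_rm_loss(chosen_rewards, rejected_rewards):
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected), averaged."""
    losses = []
    for c, r in zip(chosen_rewards, rejected_rewards):
        margin = c - r
        losses.append(math.log(1.0 + math.exp(-margin)))  # = -log(sigmoid(margin))
    return sum(losses) / len(losses)

# The loss shrinks as the reward margin between chosen and rejected grows
print(pairwise_rm_loss([1.0], [0.0]))  # ≈ 0.313
print(pairwise_rm_loss([3.0], [0.0]))  # ≈ 0.049
```

In practice the rewards come from a neural RM head and the loss is minimized with gradient descent; the math is the same.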

#### Preference tuning
Preference tuning proceeds in two stages:
- Stage 1: SPiN on the responses of the teacher model (Strong Model > Our Model)
- Stage 2: SLiC-HF using our RM
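SLiC-HF (sequence-likelihood calibration) avoids RL by applying a hinge loss directly to sequence log-probabilities: the chosen sequence's log-probability must exceed the rejected one's by a margin, with a regularizer keeping the model close to a reference (e.g. SFT) target. A minimal illustrative sketch, where the margin `delta` and weight `lam` are assumed hyperparameters, not values from the paper:

```python
def slic_loss(logp_chosen, logp_rejected, logp_ref, delta=1.0, lam=0.1):
    # Rank calibration: hinge on the log-probability margin between the
    # chosen and rejected sequences
    rank_term = max(0.0, delta - logp_chosen + logp_rejected)
    # Regularizer: keep the likelihood of a reference target high
    reg_term = -lam * logp_ref
    return rank_term + reg_term

# Margin satisfied (chosen beats rejected by 2 nats): only the regularizer remains
print(slic_loss(-10.0, -12.0, -9.0))
# Margin violated (rejected more likely than chosen): hinge term activates
print(slic_loss(-12.0, -10.0, -9.0))
```

Unlike the RM loss above, no reward model is needed at loss time; here the RM is used upstream to pick which response is "chosen".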

## 📊 Benchmarks

Here we present the results of T-lite-instruct-0.1 on automatic benchmarks.

### 🏆 [MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench)

This benchmark was carefully translated into Russian and scored with the [LLM Judge](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) codebase, using gpt-4-1106-preview as the judge.

<style>
table {
  width: auto;
}
th, td {
  padding: 5px;
}
</style>

| MT-Bench | Total | Turn_1 | Turn_2 | coding | humanities | math | reasoning | roleplay | stem | writing |
|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| **T-lite-instruct-0.1** | **6.458** | **6.833** | 6.078 | 4.136 | **8.45** | 4.25 | **4.5** | **7.667** | **7.7** | 7.706 |
| gpt3.5-turbo-0125 | 6.373 | 6.423 | **6.320** | **6.519** | 7.474 | 4.75 | 4.15 | 6.333 | 6.7 | 7.588 |
| suzume-llama-3-8B-multilingual-orpo-borda-half | 6.051 | 6.577 | 5.526 | 4.318 | 8.0 | 4.0 | 3.6 | 7.056 | 6.7 | **7.889** |
| Qwen2-7b-Instruct | 6.026 | 6.449 | 5.603 | 5.0 | 6.95 | **5.8** | 4.15 | 7.167 | 5.85 | 7.278 |
| Llama-3-8b-Instruct | 5.948 | 6.662 | 5.224 | 4.727 | 7.8 | 3.9 | 2.8 | 7.333 | 6.053 | 7.0 |
| suzume-llama-3-8B-multilingual | 5.808 | 6.167 | 5.449 | 5.409 | 6.4 | 5.05 | 3.8 | 6.556 | 5.0 | 7.056 |
| saiga_llama3_8b | 5.471 | 5.896 | 5.039 | 3.0 | 7.4 | 3.55 | 3.5 | 6.444 | 5.15 | 7.812 |
| Mistral-7B-Instruct-v0.3 | 5.135 | 5.679 | 4.584 | 4.045 | 6.35 | 3.15 | 3.2 | 5.765 | 5.2 | 7.333 |

### 🏟️ [Arena](https://github.com/lm-sys/arena-hard-auto)

We used the Russian version of the Arena benchmark from [Vikhrmodels](https://huggingface.co/datasets/Vikhrmodels/ru-arena-general) and the [Arena Hard Auto](https://github.com/lm-sys/arena-hard-auto) codebase for evaluation. As the baseline model we chose gpt3.5-turbo-0125, and the judge was gpt-4-1106-preview.

| Arena General | Score | 95% CI | Average Tokens |
|---|:---:|:---:|:---:|
| **T-lite-instruct-0.1** | **57.26** | -2.9/2 | 870 |
| gpt3.5-turbo-0125 | 50 | 0/0 | 254 |
| suzume-llama-3-8B-multilingual-orpo-borda-half | 47.17 | -2.6/2.4 | 735 |
| Llama-3-8b-Instruct | 42.16 | -2.1/2.1 | 455 |
| saiga_llama3_8b | 39.88 | -2.3/2.5 | 616 |
| suzume-llama-3-8B-multilingual | 38.25 | -1.7/1.7 | 625 |
| Qwen2-7b-Instruct | 33.42 | -1.9/2.2 | 365 |
| Mistral-7B-Instruct-v0.3 | 28.11 | -2/2.2 | 570 |
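An Arena score of this kind is essentially a win rate against the baseline (which is why gpt3.5-turbo-0125 sits at exactly 50), with the confidence interval obtained by bootstrapping over battles. A toy sketch of that computation (illustrative only, not the arena-hard-auto code; the battle outcomes are made up):

```python
import random

def winrate(outcomes):
    # outcomes per battle vs the baseline: 1.0 = win, 0.5 = tie, 0.0 = loss
    return 100.0 * sum(outcomes) / len(outcomes)

def bootstrap_ci(outcomes, n_boot=1000, seed=0):
    # Resample battles with replacement and take the 2.5th/97.5th percentiles
    rng = random.Random(seed)
    scores = sorted(
        winrate([rng.choice(outcomes) for _ in outcomes])
        for _ in range(n_boot)
    )
    return scores[int(0.025 * n_boot)], scores[int(0.975 * n_boot)]

battles = [1.0] * 57 + [0.5] * 10 + [0.0] * 33  # hypothetical outcomes
print(winrate(battles))  # 62.0
```

The real pipeline additionally weights judgments by a Bradley-Terry fit, but the bootstrap idea behind the CI column is the same.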

## 👨‍💻 Examples of usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

torch.manual_seed(42)

model_name = "t-bank-ai/T-lite-instruct-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# "Write a recipe for a great pizza!"
messages = [
    {"role": "user", "content": "Напиши рецепт классной пиццы!"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the model's EOS token or the Llama-3-style end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Output (in Russian; the model answers with a pizza recipe, cut off mid-word by the `max_new_tokens=256` limit):
```
Конечно, вот рецепт для вкусной домашней пиццы, который можно адаптировать под разные вкусы и предпочтения. Важно, чтобы тесто было мягким и воздушным, а начинка — сочной и ароматной.

### Ингредиенты для теста:
- 500 г муки (лучше использовать смесь пшеничной и цельнозерновой)
- 1 ч. л. сухих дрожжей (или 7 г свежих)
- 1 ч. л. сахара
- 1 ч. л. соли
- 1 ст. л. оливкового масла
- 300 мл тёплой воды
- 1 яйцо (для смазки)

### Ингредиенты для начинки (примерный набор):
- 200 г томатного соуса (можно сделать самому из свежих помидоров или использовать готовый)
- 200 г моцареллы, нарезанной ломтиками
- 100 г сыра пармезан (тертый)
- 100 г ветчины или колбасы
- 100 г грибов (шампин
```