Improve model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +193 -22
README.md CHANGED
@@ -1,35 +1,206 @@
  ---
  license: mit
  base_model:
  - Salesforce/codet5-base
- pipeline_tag: text2text-generation
  tags:
  - code
  - mathematics
  - theorem-proving
  ---

- Usage
  ```python
- from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
- from transformers import pipeline
- model_name = "amitayusht/ProofWala-Multilingual"
- model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- pipeline = pipeline("text2text-generation", model=model, tokenizer=tokenizer, device=-1) # device=0 for GPU, -1 for CPU
-
- # Example usage
- state = """
- Goals to prove:
- [GOALS]
- [GOAL] 1
- forall n : nat, n + 1 = 1 + n
- [END]
  """
- result = pipeline(state, max_length=100, num_return_sequences=1)
- print(result[0]['generated_text'])
- # Output:
- # [RUN TACTIC]
- # induction n; trivial.
- # [END]
  ```
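
Note on the removed usage snippet: it passes the proof state as a delimited block and prints a tactic wrapped in `[RUN TACTIC]` ... `[END]`. A minimal sketch of helpers for that round trip follows. The names `format_state` and `extract_tactic` are hypothetical, and the layout (including sequential numbering of several goals) is an assumption inferred from the single example in the snippet, not a documented ProofWala format.

```python
import re

def format_state(goals):
    """Build a prompt in the layout of the removed snippet.

    Assumption: multiple goals are numbered sequentially from 1; the
    snippet itself only shows a single goal.
    """
    lines = ["Goals to prove:", "[GOALS]"]
    for index, goal in enumerate(goals, start=1):
        lines.append(f"[GOAL] {index}")
        lines.append(goal)
    lines.append("[END]")
    # The snippet's triple-quoted string starts and ends with a newline.
    return "\n" + "\n".join(lines) + "\n"

# Matches the text between "[RUN TACTIC]" and the next "[END]".
_TACTIC_RE = re.compile(r"\[RUN TACTIC\]\s*(.*?)\s*\[END\]", re.DOTALL)

def extract_tactic(generated_text):
    """Return the tactic text, or None if the markers are missing."""
    match = _TACTIC_RE.search(generated_text)
    return match.group(1) if match else None

if __name__ == "__main__":
    print(format_state(["forall n : nat, n + 1 = 1 + n"]))
    print(extract_tactic("[RUN TACTIC]\ninduction n; trivial.\n[END]"))
```

The string returned by `format_state` could be passed to the text2text pipeline from the snippet in place of the hand-written `state`, and `extract_tactic` applied to `result[0]['generated_text']`.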
 
  ---
  license: mit
+ library_name: transformers
  base_model:
  - Salesforce/codet5-base
+ pipeline_tag: text-generation
  tags:
  - code
  - mathematics
  - theorem-proving
  ---

+ # Model Card for CodeFuse-DeepSeek-33B
+ ![logo](LOGO.jpg)
+
+ [[中文]](#chinese) [[English]](#english)
+
+ GitHub: https://github.com/trishullab/proof-wala
+
+ <a id="english"></a>
+
+ ## Model Description
+
+ CodeFuse-DeepSeek-33B is a 33B Code-LLM fine-tuned with QLoRA on multiple code-related tasks, using DeepSeek-Coder-33B as the base model.
+
+ <br>
+
+ ## News and Updates
+
+ 🔥🔥🔥 2024-01-12 CodeFuse-DeepSeek-33B has been released, achieving a pass@1 (greedy decoding) score of 78.65% on HumanEval.
+
+ 🔥🔥🔥 2024-01-12 CodeFuse-Mixtral-8x7B has been released, achieving a pass@1 (greedy decoding) score of 56.1% on HumanEval, which is a 15% increase compared to Mixtral-8x7b's 40%.
+
+ 🔥🔥 2023-11-10 CodeFuse-CodeGeeX2-6B has been released, achieving a pass@1 (greedy decoding) score of 45.12% on HumanEval, which is a 9.22% increase compared to CodeGeeX2's 35.9%.
+
+ 🔥🔥 2023-10-20 The CodeFuse-QWen-14B technical documentation has been released. For details, see the CodeFuse article on our WeChat official account (https://mp.weixin.qq.com/s/PCQPkvbvfxSPzsqjOILCDw).
+
+ 🔥🔥 2023-10-16 CodeFuse-QWen-14B has been released, achieving a pass@1 (greedy decoding) score of 48.78% on HumanEval, which is a 16% increase compared to Qwen-14b's 32.3%.
+
+ 🔥🔥 2023-09-27 CodeFuse-StarCoder-15B has been released, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval, which is a 21% increase compared to StarCoder's 33.6%.
+
+ 🔥🔥 2023-09-26 We are pleased to announce the release of the 4-bit quantized version of CodeFuse-CodeLlama-34B. Despite quantization, the model still achieves a pass@1 (greedy decoding) score of 73.8% on HumanEval.
+
+ 🔥🔥 2023-09-11 CodeFuse-CodeLlama-34B has achieved a pass@1 (greedy decoding) score of 74.4% on HumanEval, the SOTA result for open-source LLMs at present.
+
+ <br>
+
+ ## Code Community
+
+ **Homepage**: 🏡 https://github.com/codefuse-ai (**Please give us your support with a Star🌟 + Fork🚀 + Watch👀**)
+
+ + If you wish to fine-tune the model yourself, you can visit ✨[MFTCoder](https://github.com/codefuse-ai/MFTCoder)✨✨
+
+
+ + If you wish to see a demo of the model, you can visit ✨[CodeFuse Demo](https://github.com/codefuse-ai/codefuse)✨✨
+
+ <br>
+
+ ## Performance
+
+ ### Code
+
+ | Model                            | HumanEval (pass@1) | Date    |
+ |:---------------------------------|:------------------:|:-------:|
+ | **CodeFuse-DeepSeek-33B**        | **78.65%**         | 2024.01 |
+ | **CodeFuse-Mixtral-8x7B**        | **56.10%**         | 2024.01 |
+ | **CodeFuse-CodeLlama-34B**       | 74.4%              | 2023.9  |
+ | **CodeFuse-CodeLlama-34B-4bits** | 73.8%              | 2023.9  |
+ | **CodeFuse-StarCoder-15B**       | 54.9%              | 2023.9  |
+ | **CodeFuse-QWen-14B**            | 48.78%             | 2023.10 |
+ | **CodeFuse-CodeGeeX2-6B**        | 45.12%             | 2023.11 |
+ | WizardCoder-Python-34B-V1.0      | 73.2%              | 2023.8  |
+ | GPT-4(zero-shot)                 | 67.0%              | 2023.3  |
+ | PanGu-Coder2 15B                 | 61.6%              | 2023.8  |
+ | CodeLlama-34b-Python             | 53.7%              | 2023.8  |
+ | CodeLlama-34b                    | 48.8%              | 2023.8  |
+ | GPT-3.5(zero-shot)               | 48.1%              | 2022.11 |
+ | OctoCoder                        | 46.2%              | 2023.8  |
+ | StarCoder-15B                    | 33.6%              | 2023.5  |
+ | Qwen-14b                         | 32.3%              | 2023.10 |
+
+
+
+
+ ### NLP
+
+ ![NLP Performance Radar](codefuse-deepseek-33b-nlp.png)
+
+ <br>
+
+ ## Requirements
+
+ * python>=3.8
+ * pytorch>=2.0.0
+ * transformers>=4.33.2
+ * sentencepiece
+ * CUDA 11.4
+ <br>
+
+ ## Inference String Format
+
+ The inference string is formed by concatenating conversation data (system, human, and bot contents) in the training data format. It is used as the model input during inference.
+ Here are examples of prompts used to query the model:
+
+ **Multi-Round with System Prompt:**
  ```python
  """
+ <s>system
+ System instruction
+ <s>human
+ Human 1st round input
+ <s>bot
+ Bot 1st round output<|end of sentence|>
+ <s>human
+ Human 2nd round input
+ <s>bot
+ Bot 2nd round output<|end of sentence|>
+ ...
+ ...
+ ...
+ <s>human
+ Human nth round input
+ <s>bot
+ """
+ ```
126
+
127
+ **Single-Round without System Prompt:**
128
+ ```python
129
+ """
130
+ <s>human
131
+ User prompt...
132
+ <s>bot
133
+
134
+ """
135
+ ```
+
+ In this format, the system section is optional, and the conversation can be either single-turn or multi-turn. At inference time, always end the input string with "\<s\>bot" to prompt the model to generate an answer.
+
+ For example, the format used for HumanEval inference looks like this:
+
+ ```
+ <s>human
+ # language: Python
+ from typing import List
+ def separate_paren_groups(paren_string: str) -> List[str]:
+     """ Input to this function is a string containing multiple groups of nested parentheses. Your goal is to
+     separate those group into separate strings and return the list of those.
+     Separate groups are balanced (each open brace is properly closed) and not nested within each other
+     Ignore any spaces in the input string.
+     >>> separate_paren_groups('( ) (( )) (( )( ))')
+     ['()', '(())', '(()())']
+     """
+ <s>bot
+
+ ```
+
+ We also use the programming-language tag of the CodeGeeX model series (e.g., for Python, we use `# language: Python`).
158
+
159
+ ## Quickstart
160
+
161
+
162
+ ```python
163
+ import torch
164
+ from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
165
+
166
+ model_dir = "codefuse-ai/CodeFuse-DeepSeek-33B"
167
+
168
+ def load_model_tokenizer(model_path):
169
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
170
+ tokenizer.eos_token = "<|end of sentence|>"
171
+ tokenizer.pad_token = "<|end of sentence|>"
172
+ tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)
173
+ tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
174
+ tokenizer.padding_side = "left"
175
+
176
+ model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto',torch_dtype=torch.bfloat16, trust_remote_code=True)
177
+ return model, tokenizer
178
+
179
+
180
+ HUMAN_ROLE_START_TAG = "<s>human\n"
181
+ BOT_ROLE_START_TAG = "<s>bot\n"
182
+
183
+ text_list = [f'{HUMAN_ROLE_START_TAG}Please write a quicksort program\n#Python\n{BOT_ROLE_START_TAG}']
184
+
185
+ model, tokenizer = load_model_tokenizer(model_dir)
186
+ inputs = tokenizer(text_list, return_tensors='pt', padding=True, add_special_tokens=False).to('cuda')
187
+ input_ids = inputs["input_ids"]
188
+ attention_mask = inputs["attention_mask"]
189
+ generation_config = GenerationConfig(
190
+ eos_token_id=tokenizer.eos_token_id,
191
+ pad_token_id=tokenizer.pad_token_id,
192
+ temperature=0.2,
193
+ max_new_tokens=512,
194
+ num_return_sequences=1,
195
+ num_beams=1,
196
+ top_p=0.95,
197
+ do_sample=False
198
+ )
199
+ outputs = model.generate(
200
+ inputs= input_ids,
201
+ attention_mask=attention_mask,
202
+ **generation_config.to_dict()
203
+ )
204
+ gen_text = tokenizer.batch_decode(outputs[:, input_ids.shape[1]:], skip_special_tokens=True)
205
+ print(gen_text[0])
206
  ```
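
Note on the "Inference String Format" section added above: the layout it describes (an optional system block, alternating human/bot turns, and a trailing `<s>bot`) could be assembled programmatically. The following is a minimal sketch only. `build_inference_string` and its arguments are hypothetical names, and the newline placed after each `<|end of sentence|>` is an assumption read off the multi-round example rather than a confirmed detail of the training format.

```python
SYSTEM_TAG = "<s>system\n"
HUMAN_TAG = "<s>human\n"
BOT_TAG = "<s>bot\n"
EOS_TOKEN = "<|end of sentence|>"

def build_inference_string(turns, system=None):
    """Concatenate conversation turns into one prompt string.

    turns:  list of (human_text, bot_text) pairs; the last pair must have
            bot_text=None so the string ends with the bot tag.
    system: optional system instruction placed before the first turn.
    """
    if not turns or turns[-1][1] is not None:
        raise ValueError("the last turn must have bot_text=None")
    parts = []
    if system is not None:
        parts.append(f"{SYSTEM_TAG}{system}\n")
    for human_text, bot_text in turns:
        parts.append(f"{HUMAN_TAG}{human_text}\n{BOT_TAG}")
        if bot_text is not None:
            # Completed bot turns are closed with the end-of-sentence token.
            parts.append(f"{bot_text}{EOS_TOKEN}\n")
    return "".join(parts)

if __name__ == "__main__":
    prompt = build_inference_string(
        [("Please write a quicksort program\n#Python", None)]
    )
    print(repr(prompt))
```

For the single-turn case this produces the same string as the `text_list` entry in the Quickstart, so the result could be passed to the tokenizer there unchanged.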