amitayusht
/

ProofWala-Multilingual

@@ -1,35 +1,206 @@
 ---
 license: mit
 base_model:
 - Salesforce/codet5-base
-pipeline_tag: text2text-generation
 tags:
 - code
 - mathematics
 - theorem-proving
 ---
-Usage
 ```python
-from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
-from transformers import pipeline
-model_name = "amitayusht/ProofWala-Multilingual"
-model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-pipeline = pipeline("text2text-generation", model=model, tokenizer=tokenizer, device=-1) # device=0 for GPU, -1 for CPU
-# Example usage
-state = """
-Goals to prove:
-[GOALS]
-[GOAL] 1
-forall n : nat, n + 1 = 1 + n
-[END]
 """
-result = pipeline(state, max_length=100, num_return_sequences=1)
-print(result[0]['generated_text'])
-# Output:
-# [RUN TACTIC]
-# induction n; trivial.
-# [END]
 ```

 ---
 license: mit
+library_name: transformers
 base_model:
 - Salesforce/codet5-base
+pipeline_tag: text-generation
 tags:
 - code
 - mathematics
 - theorem-proving
 ---
+# Model Card for CodeFuse-DeepSeek-33B
+![logo](LOGO.jpg)
+[[中文]](#chinese)    [[English]](#english)
+Github: https://github.com/trishullab/proof-wala
+<a id="english"></a>
+## Model Description
+CodeFuse-DeepSeek-33B is a 33B code LLM fine-tuned with QLoRA on multiple code-related tasks, using DeepSeek-Coder-33B as the base model.
+<br>
+## News and Updates
+🔥🔥🔥 2024-01-12 CodeFuse-DeepSeek-33B has been released, achieving a pass@1 (greedy decoding) score of 78.65% on HumanEval.
+🔥🔥🔥 2024-01-12 CodeFuse-Mixtral-8x7B has been released, achieving a pass@1 (greedy decoding) score of 56.1% on HumanEval, which is a 15% increase compared to Mixtral-8x7b's 40%.
+🔥🔥 2023-11-10 CodeFuse-CodeGeeX2-6B has been released, achieving a pass@1 (greedy decoding) score of 45.12% on HumanEval, which is a 9.22 percentage-point increase over CodeGeeX2's 35.9%.
+🔥🔥 2023-10-20 The CodeFuse-QWen-14B technical documentation has been released. For details, see the CodeFuse article on our WeChat official account: https://mp.weixin.qq.com/s/PCQPkvbvfxSPzsqjOILCDw
+🔥🔥 2023-10-16 CodeFuse-QWen-14B has been released, achieving a pass@1 (greedy decoding) score of 48.78% on HumanEval, which is a 16% increase compared to Qwen-14b's 32.3%.
+🔥🔥 2023-09-27 CodeFuse-StarCoder-15B has been released, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval, which is a 21% increase compared to StarCoder's 33.6%.
+🔥🔥 2023-09-26 We are pleased to announce the release of the 4-bit quantized version of CodeFuse-CodeLlama-34B. Despite the quantization process, the model still achieves a remarkable 73.8% accuracy (greedy decoding) on the HumanEval pass@1 metric.
+🔥🔥 2023-09-11 CodeFuse-CodeLlama-34B has achieved 74.4% pass@1 (greedy decoding) on HumanEval, a state-of-the-art result for open-source LLMs at the time.
+<br>
+## Code Community
+**Homepage**: 🏡 https://github.com/codefuse-ai (**Please give us your support with a Star🌟 + Fork🚀 + Watch👀**)
++ If you wish to fine-tune the model yourself, you can visit ✨[MFTCoder](https://github.com/codefuse-ai/MFTCoder)✨✨
++ If you wish to see a demo of the model, you can visit ✨[CodeFuse Demo](https://github.com/codefuse-ai/codefuse)✨✨
+<br>
+## Performance
+### Code
+| Model                       | HumanEval(pass@1) |  Date   |
+|:----------------------------|:-----------------:|:-------:|
+| **CodeFuse-DeepSeek-33B**   |     **78.65%**    | 2024.01 |
+| **CodeFuse-Mixtral-8x7B**   |     **56.10%**    | 2024.01 |
+| **CodeFuse-CodeLlama-34B**  |     74.4%         | 2023.9  |
+| **CodeFuse-CodeLlama-34B-4bits** |     73.8%    | 2023.9  |
+| **CodeFuse-StarCoder-15B**  |     54.9%         | 2023.9  |
+| **CodeFuse-QWen-14B**       |     48.78%        | 2023.10 |
+| **CodeFuse-CodeGeeX2-6B**   |     45.12%        | 2023.11 |
+| WizardCoder-Python-34B-V1.0 |       73.2%       | 2023.8  |
+| GPT-4(zero-shot)            |       67.0%       | 2023.3  |
+| PanGu-Coder2 15B            |       61.6%       | 2023.8  |
+| CodeLlama-34b-Python        |       53.7%       | 2023.8  |
+| CodeLlama-34b               |       48.8%       | 2023.8  |
+| GPT-3.5(zero-shot)          |       48.1%       | 2022.11 |
+| OctoCoder                   |       46.2%       | 2023.8  |
+| StarCoder-15B               |       33.6%       | 2023.5  |
+| Qwen-14b                    |       32.3%       | 2023.10 |
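The gains over the base models can be read off the table as percentage-point differences. The snippet below is a small illustrative sketch using only two fine-tuned/base pairs whose scores both appear in the table; the `scores` dictionary and the `point_gain` helper are hypothetical names introduced here for illustration.

```python
# HumanEval pass@1 scores (in percent) copied from the table above.
scores = {
    "CodeFuse-StarCoder-15B": 54.9,
    "StarCoder-15B": 33.6,
    "CodeFuse-QWen-14B": 48.78,
    "Qwen-14b": 32.3,
}

# (fine-tuned model, base model) pairs that both appear in the table.
pairs = [
    ("CodeFuse-StarCoder-15B", "StarCoder-15B"),
    ("CodeFuse-QWen-14B", "Qwen-14b"),
]


def point_gain(tuned: str, base: str) -> float:
    """Return the pass@1 gain of `tuned` over `base` in percentage points."""
    return round(scores[tuned] - scores[base], 2)


for tuned, base in pairs:
    print(f"{tuned} vs {base}: +{point_gain(tuned, base):.2f} points")
```

Note that these are absolute differences in pass@1, not relative improvements.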
+### NLP
+![NLP Performance Radar](codefuse-deepseek-33b-nlp.png)
+<br>
+## Requirements
+* python>=3.8
+* pytorch>=2.0.0
+* transformers>=4.33.2
+* sentencepiece
+* CUDA 11.4
+  <br>
+## Inference String Format
+The inference string is formed by concatenating the conversation data (system, human, and bot contents) in the training data format. It is used as the model input during inference.
+Here are examples of prompts used to request the model:
+**Multi-Round with System Prompt:**
 ```python
 """
+<s>system
+System instruction
+<s>human
+Human 1st round input
+<s>bot
+Bot 1st round output<｜end of sentence｜>
+<s>human
+Human 2nd round input
+<s>bot
+Bot 2nd round output<｜end of sentence｜>
+...
+...
+...
+<s>human
+Human nth round input
+<s>bot
+"""
+```
+**Single-Round without System Prompt:**
+```python
+"""
+<s>human
+User prompt...
+<s>bot
+"""
+```
+In this format, the system section is optional, and the conversation can be either single-turn or multi-turn. At inference time, always end the input string with "\<s\>bot" so that the model generates the next answer.
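The format described above can be assembled programmatically. The following is a minimal sketch, not the authors' implementation: `build_prompt` and its arguments are hypothetical names, and the placement of newlines between turns is an assumption based on the layout shown in the examples. Only the `<s>system`, `<s>human`, and `<s>bot` role tags and the `<｜end of sentence｜>` terminator come from this card.

```python
from typing import List, Optional, Tuple

# Role tags and end-of-sentence marker as shown in this card.
EOS_TOKEN = "<｜end of sentence｜>"
SYSTEM_TAG = "<s>system\n"
HUMAN_TAG = "<s>human\n"
BOT_TAG = "<s>bot\n"


def build_prompt(
    turns: List[Tuple[str, Optional[str]]],
    system: Optional[str] = None,
) -> str:
    """Build an inference string from (human, bot) turns.

    The last turn must have bot=None, so the string ends with the bot tag
    and the model is asked to produce the next answer.
    Newline placement is an assumption (hypothetical helper).
    """
    if not turns or turns[-1][1] is not None:
        raise ValueError("the last turn must be an unanswered human input")

    parts = []
    if system is not None:
        parts.append(f"{SYSTEM_TAG}{system}\n")

    # Completed rounds: human input, then bot output closed by the EOS token.
    for human_text, bot_text in turns[:-1]:
        if bot_text is None:
            raise ValueError("only the last turn may be unanswered")
        parts.append(f"{HUMAN_TAG}{human_text}\n{BOT_TAG}{bot_text}{EOS_TOKEN}\n")

    # Final round: human input followed by an open bot tag.
    parts.append(f"{HUMAN_TAG}{turns[-1][0]}\n{BOT_TAG}")
    return "".join(parts)


turns = [
    ("Write hello world in Python", "print('hello world')"),
    ("Now add a comment", None),
]
prompt = build_prompt(turns, system="You are a coding assistant.")
print(prompt)
```

The tag constants mirror `HUMAN_ROLE_START_TAG` and `BOT_ROLE_START_TAG` from the Quickstart, which also include a trailing newline after each tag.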
+For example, the prompt used for HumanEval inference looks like this:
+```
+<s>human
+# language: Python
+from typing import List
+def separate_paren_groups(paren_string: str) -> List[str]:
+    """ Input to this function is a string containing multiple groups of nested parentheses. Your goal is to
+    separate those group into separate strings and return the list of those.
+    Separate groups are balanced (each open brace is properly closed) and not nested within each other
+    Ignore any spaces in the input string.
+    >>> separate_paren_groups('( ) (( )) (( )( ))')
+    ['()', '(())', '(()())']
+    """
+<s>bot
+```
+We also adopt the programming-language tag used by the CodeGeeX series of models (for example, "```# language: Python```" for Python).
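As a rough illustration of how such a tagged prompt might be produced, the sketch below wraps a code snippet in the human/bot tags and prepends the language tag. `build_code_prompt` and its `comment_prefix` parameter are hypothetical; this card only shows the Python form of the tag, so the prefix for other languages is an assumption left to the caller.

```python
HUMAN_TAG = "<s>human\n"
BOT_TAG = "<s>bot\n"


def build_code_prompt(code: str, language: str = "Python", comment_prefix: str = "#") -> str:
    """Wrap `code` in a single-round prompt with a language tag on the first line.

    Hypothetical helper: only the "# language: Python" form appears in this card.
    """
    language_tag = f"{comment_prefix} language: {language}"
    # Strip trailing whitespace so exactly one newline precedes the bot tag.
    return f"{HUMAN_TAG}{language_tag}\n{code.rstrip()}\n{BOT_TAG}"


snippet = '''from typing import List
def separate_paren_groups(paren_string: str) -> List[str]:
    """ Separate groups of nested parentheses into separate strings. """
'''
print(build_code_prompt(snippet))
```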
+## Quickstart
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
+model_dir = "codefuse-ai/CodeFuse-DeepSeek-33B"
+def load_model_tokenizer(model_path):
+    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+    tokenizer.eos_token = "<｜end of sentence｜>"
+    tokenizer.pad_token = "<｜end of sentence｜>"
+    tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)
+    tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
+    tokenizer.padding_side = "left"
+    model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', torch_dtype=torch.bfloat16, trust_remote_code=True)
+    return model, tokenizer
+HUMAN_ROLE_START_TAG = "<s>human\n"
+BOT_ROLE_START_TAG = "<s>bot\n"
+text_list = [f'{HUMAN_ROLE_START_TAG}Please write a quicksort program\n#Python\n{BOT_ROLE_START_TAG}']
+model, tokenizer = load_model_tokenizer(model_dir)
+inputs = tokenizer(text_list, return_tensors='pt', padding=True, add_special_tokens=False).to('cuda')
+input_ids = inputs["input_ids"]
+attention_mask = inputs["attention_mask"]
+generation_config = GenerationConfig(
+        eos_token_id=tokenizer.eos_token_id,
+        pad_token_id=tokenizer.pad_token_id,
+        temperature=0.2,
+        max_new_tokens=512,
+        num_return_sequences=1,
+        num_beams=1,
+        top_p=0.95,
+        do_sample=False  # greedy decoding: temperature and top_p above are not used
+)
+outputs = model.generate(
+        inputs=input_ids,
+        attention_mask=attention_mask,
+        generation_config=generation_config
+)
+gen_text = tokenizer.batch_decode(outputs[:, input_ids.shape[1]:], skip_special_tokens=True)
+print(gen_text[0])
 ```