| | --- |
| | language: |
| | - en |
| | license: apache-2.0 |
| | tags: |
| | - code |
| | - mathematics |
| | datasets: |
| | - ajibawa-2023/Code-290k-ShareGPT |
| | - m-a-p/Code-Feedback |
| | - microsoft/orca-math-word-problems-200k |
| | - teknium/openhermes |
| | model-index: |
| | - name: Code-Mistral-7B |
| | results: |
| | - task: |
| | type: text-generation |
| | name: Text Generation |
| | dataset: |
| | name: AI2 Reasoning Challenge (25-Shot) |
| | type: ai2_arc |
| | config: ARC-Challenge |
| | split: test |
| | args: |
| | num_few_shot: 25 |
| | metrics: |
| | - type: acc_norm |
| | value: 64.59 |
| | name: normalized accuracy |
| | source: |
| | url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/Code-Mistral-7B |
| | name: Open LLM Leaderboard |
| | - task: |
| | type: text-generation |
| | name: Text Generation |
| | dataset: |
| | name: HellaSwag (10-Shot) |
| | type: hellaswag |
| | split: validation |
| | args: |
| | num_few_shot: 10 |
| | metrics: |
| | - type: acc_norm |
| | value: 85.29 |
| | name: normalized accuracy |
| | source: |
| | url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/Code-Mistral-7B |
| | name: Open LLM Leaderboard |
| | - task: |
| | type: text-generation |
| | name: Text Generation |
| | dataset: |
| | name: MMLU (5-Shot) |
| | type: cais/mmlu |
| | config: all |
| | split: test |
| | args: |
| | num_few_shot: 5 |
| | metrics: |
| | - type: acc |
| | value: 65.0 |
| | name: accuracy |
| | source: |
| | url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/Code-Mistral-7B |
| | name: Open LLM Leaderboard |
| | - task: |
| | type: text-generation |
| | name: Text Generation |
| | dataset: |
| | name: TruthfulQA (0-shot) |
| | type: truthful_qa |
| | config: multiple_choice |
| | split: validation |
| | args: |
| | num_few_shot: 0 |
| | metrics: |
| | - type: mc2 |
| | value: 54.64 |
| | source: |
| | url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/Code-Mistral-7B |
| | name: Open LLM Leaderboard |
| | - task: |
| | type: text-generation |
| | name: Text Generation |
| | dataset: |
| | name: Winogrande (5-shot) |
| | type: winogrande |
| | config: winogrande_xl |
| | split: validation |
| | args: |
| | num_few_shot: 5 |
| | metrics: |
| | - type: acc |
| | value: 82.24 |
| | name: accuracy |
| | source: |
| | url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/Code-Mistral-7B |
| | name: Open LLM Leaderboard |
| | - task: |
| | type: text-generation |
| | name: Text Generation |
| | dataset: |
| | name: GSM8k (5-shot) |
| | type: gsm8k |
| | config: main |
| | split: test |
| | args: |
| | num_few_shot: 5 |
| | metrics: |
| | - type: acc |
| | value: 68.08 |
| | name: accuracy |
| | source: |
| | url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ajibawa-2023/Code-Mistral-7B |
| | name: Open LLM Leaderboard |
| | --- |
| | |
| | **Code-Mistral-7B** |
| |
|
| |
|
| | This model is trained on a refined version of my dataset [Code-290k-ShareGPT](https://huggingface.co/datasets/ajibawa-2023/Code-290k-ShareGPT). |
| |
|
| | In addition, it is trained on the following datasets: |
| |
|
| | [Code-Feedback](https://huggingface.co/datasets/m-a-p/Code-Feedback) |
| |
|
| | [orca-math-word-problems-200k](https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k) |
| |
|
| | [Openhermes](https://huggingface.co/datasets/teknium/openhermes) |
| |
|
| | The idea was to check how this model performs when trained on both code and maths datasets. The model is very good at coding. |
| | Maths is still hit and miss, but you can test the model out yourself. |
| |
|
| | This model is trained on large datasets, and the results are good (see the Open LLM Leaderboard evaluation results below). |
| | I have used the ChatML prompt format. |
| |
|
| | Kindly note that this is a QLoRA version, a rare exception. |
| |
|
| | **GGUF & Exllama** |
| |
|
| | GGUF: [Link](https://huggingface.co/bartowski/Code-Mistral-7B-GGUF) |
| |
|
| | Exllama v2: [Link](https://huggingface.co/bartowski/Code-Mistral-7B-exl2) |
| |
|
| | Special thanks to [Bartowski](https://huggingface.co/bartowski) for quantizing this model. |
| |
|
| |
|
| | **Training:** |
| |
|
| | The model was trained on the entire dataset using 4 x A100 80GB GPUs. Training for 3 epochs took almost 33 hours. The Axolotl codebase was used for training. |
| | All of the data was used to train on top of Mistral. |
| |
|
| | **Example Prompt:** |
| | This model uses the **ChatML** prompt format. |
| |
|
| | ``` |
| | <|im_start|>system |
| | You are a helpful AI assistant.<|im_end|> |
| | <|im_start|>user |
| | {prompt}<|im_end|> |
| | <|im_start|>assistant |
| | |
| | ``` |
| | You can modify the above prompt as per your requirements. |
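
As a minimal sketch of how the ChatML template above could be filled in programmatically, the helper below builds the prompt string from a user message and an optional system message. The function name `build_chatml_prompt` and its parameters are hypothetical and not part of this model's codebase; only the template layout is taken from the example above.

```python
# Hypothetical helper (not part of the model's codebase): fills in the
# ChatML template shown above using plain string formatting.
DEFAULT_SYSTEM_PROMPT = "You are a helpful AI assistant."


def build_chatml_prompt(user_prompt: str,
                        system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Return a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    # The resulting string is what would be passed to the model as input text.
    print(build_chatml_prompt("Write a C++ function that reverses a string."))
```

Leaving the assistant turn open at the end is what lets the model continue with its answer; changing `system_prompt` is one way to adapt the prompt to your requirements.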
| |
|
| |
|
| | I want to say special thanks to the open source community for helping and guiding me to better understand AI and model development. |
| |
|
| | Thank you for your love & support. |
| |
|
| |
|
| | **Example Output** |
| |
|
| |
|
| | **C++** |
| |
|
| |  |
| |
|
| | **Error Resolving** |
| |
|
| |  |
| |
|
| | **Matrices** |
| |
|
| |  |
| |
|
| | **Machine Learning** |
| |
|
| |  |
| |
|
| |
|
| |
|
| | # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) |
| | Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ajibawa-2023__Code-Mistral-7B). |
| |
|
| | | Metric |Value| |
| | |---------------------------------|----:| |
| | |Avg. |69.97| |
| | |AI2 Reasoning Challenge (25-Shot)|64.59| |
| | |HellaSwag (10-Shot) |85.29| |
| | |MMLU (5-Shot) |65.00| |
| | |TruthfulQA (0-shot) |54.64| |
| | |Winogrande (5-shot) |82.24| |
| | |GSM8k (5-shot) |68.08| |
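
The `Avg.` row appears to be the unweighted mean of the six benchmark scores in the table. The short check below reproduces it under that assumption; the names `SCORES` and `leaderboard_average` are hypothetical and used only for illustration.

```python
# Benchmark scores copied from the table above.
SCORES = {
    "AI2 Reasoning Challenge (25-Shot)": 64.59,
    "HellaSwag (10-Shot)": 85.29,
    "MMLU (5-Shot)": 65.00,
    "TruthfulQA (0-shot)": 54.64,
    "Winogrande (5-shot)": 82.24,
    "GSM8k (5-shot)": 68.08,
}


def leaderboard_average(scores: dict) -> float:
    """Unweighted mean of the scores, rounded to two decimals.

    Assumption: the leaderboard average is a simple mean of the six tasks.
    """
    return round(sum(scores.values()) / len(scores), 2)


if __name__ == "__main__":
    print(leaderboard_average(SCORES))
```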
| |
|
| |
|