---
license: other
model-index:
- name: SUS-Chat-72B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 66.3
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=SUSTech/SUS-Chat-72B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.96
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=SUSTech/SUS-Chat-72B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 76.7
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=SUSTech/SUS-Chat-72B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 60.27
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=SUSTech/SUS-Chat-72B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 83.43
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=SUSTech/SUS-Chat-72B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 9.4
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=SUSTech/SUS-Chat-72B
      name: Open LLM Leaderboard
---




# 🐷SUS-Chat: Instruction tuning done right

[中文](README_CN.md)  |  English



# News

- 2023-12-09: 🔥 The `Tigerbot` variant has been [deleted](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/438); `SUS-Chat-34B` is now the top-ranked LLaMA model and the top-ranked chat model.
- 2023-12-07: SUS-Chat-34B is now available on [WiseModel🧠](https://wisemodel.cn/model/SUSTech/SUS-Chat-34B).
- 2023-12-06: Try the [SUS-Chat-34B chat-ui](https://huggingface.co/spaces/SUSTech/SUS-Chat-34B).
- 2023-12-05: SUS-Chat-34B is now available on [ModelScope🤖](https://www.modelscope.cn/models/SUSTC/SUS-Chat-34B/summary).
- 2023-12-05: SUS-Chat-34B is ranked 2nd on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), surpassing all models under 70B.
- 2023-12-01: SUS-Chat-34B is now available on [HuggingFace🤗](https://huggingface.co/SUSTech/SUS-Chat-34B).

# Introduction

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_SUSTech__SUS-Chat-72B)

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 63.51 |
| AI2 Reasoning Challenge (25-Shot) | 66.30 |
| HellaSwag (10-Shot)               | 84.96 |
| MMLU (5-Shot)                     | 76.70 |
| TruthfulQA (0-shot)               | 60.27 |
| Winogrande (5-shot)               | 83.43 |
| GSM8k (5-shot)                    |  9.40 |
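The Avg. row is the unweighted mean of the six benchmark scores; a minimal sketch of that arithmetic, using the values from the table above:

```python
# Open LLM Leaderboard scores for SUS-Chat-72B, from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 66.30,
    "HellaSwag (10-Shot)": 84.96,
    "MMLU (5-Shot)": 76.70,
    "TruthfulQA (0-shot)": 60.27,
    "Winogrande (5-shot)": 83.43,
    "GSM8k (5-shot)": 9.40,
}

# The leaderboard's Avg. column is the plain mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 63.51
```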