---
license: mit
model-index:
- name: pygmalion-instruct
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 52.56
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=AlpinDale/pygmalion-instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 77.65
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=AlpinDale/pygmalion-instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 35.94
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=AlpinDale/pygmalion-instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 42.13
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=AlpinDale/pygmalion-instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 72.06
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=AlpinDale/pygmalion-instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 9.86
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=AlpinDale/pygmalion-instruct
      name: Open LLM Leaderboard
---

## Model Details

Experimental model, trained on the [Pygmalion](https://huggingface.co/PygmalionAI/pygmalion-6b/tree/dev) and [WizardLM](https://huggingface.co/ehartford/WizardLM-7B-Uncensored) datasets.

The purpose of this model is to enable complex Instruct prompting while retaining the roleplay (RP) capabilities of Pygmalion.
### Prompting format
```
instruction:
output:
```
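As a sketch of how this format might be assembled in code (the `build_prompt` helper is our own; the card only specifies the `instruction:` / `output:` layout):

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the model's expected prompting format.

    The model completes the text after the trailing "output:" marker.
    """
    return f"instruction: {instruction}\noutput:"

prompt = build_prompt("Greet the user in the style of a cheerful pirate.")
print(prompt)
```

The resulting string can then be passed to the model through any text-generation frontend, for example the `transformers` `generate` API.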

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

### Uses

The intended use case is role-playing with Instruct prompts, which should make it easier to guide the bot towards a particular conversation style. This is still subject to experimentation.
### Out-of-Scope Use

- Assistant bot (may provide incorrect instructions)
- Complex multi-character chat

### Risks

The model can generate potentially harmful or NSFW outputs. Please use with caution.

### Citation

WizardLM:
```
@misc{xu2023wizardlm,
      title={WizardLM: Empowering Large Language Models to Follow Complex Instructions},
      author={Can Xu and Qingfeng Sun and Kai Zheng and Xiubo Geng and Pu Zhao and Jiazhan Feng and Chongyang Tao and Daxin Jiang},
      year={2023},
      eprint={2304.12244},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_AlpinDale__pygmalion-instruct).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 48.37 |
| AI2 Reasoning Challenge (25-Shot) | 52.56 |
| HellaSwag (10-Shot)               | 77.65 |
| MMLU (5-Shot)                     | 35.94 |
| TruthfulQA (0-shot)               | 42.13 |
| Winogrande (5-shot)               | 72.06 |
| GSM8k (5-shot)                    |  9.86 |
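The `Avg.` row is the plain mean of the six benchmark scores, which a quick sketch can verify:

```python
# Sanity check: the leaderboard average is the mean of the six
# benchmark scores above, rounded to two decimal places.
scores = {
    "ARC (25-Shot)": 52.56,
    "HellaSwag (10-Shot)": 77.65,
    "MMLU (5-Shot)": 35.94,
    "TruthfulQA (0-shot)": 42.13,
    "Winogrande (5-shot)": 72.06,
    "GSM8k (5-shot)": 9.86,
}
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 48.37
```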