agentlans/Qwen2.5-0.5B-Instruct-CrashCourse-dropout


agentlans committed on Jan 3, 2025

Commit ed55cb4

1 Parent(s): 465aad3

Update README.md

README.md +17 -12

@@ -107,10 +107,14 @@ model-index:
 ---
 # Qwen2.5-0.5B-Instruct-CrashCourse-dropout
-## Model Description
-This model is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct, specifically adapted for enhanced performance on instructional and multitask scenarios.
-It leverages two datasets: "agentlans/crash-course" and "vicgalle/configurable-system-prompt-multitask" to improve its capabilities in handling diverse tasks and responding to various instruction formats.
 ## Intended Use
@@ -143,13 +147,14 @@ For more details on the base model, please refer to the Qwen/Qwen2.5-0.5B-Instru
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/agentlans__Qwen2.5-0.5B-Instruct-CrashCourse-dropout-details)!
 Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=agentlans%2FQwen2.5-0.5B-Instruct-CrashCourse-dropout&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
-|      Metric       |Value (%)|
-|-------------------|--------:|
-|**Average**        |     7.74|
-|IFEval (0-Shot)    |    29.49|
-|BBH (3-Shot)       |     7.23|
-|MATH Lvl 5 (4-Shot)|     0.08|
-|GPQA (0-shot)      |     1.79|
-|MuSR (0-shot)      |     1.11|
-|MMLU-PRO (5-shot)  |     6.76|

 ---
 # Qwen2.5-0.5B-Instruct-CrashCourse-dropout
+This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct),
+specifically adapted for enhanced performance on instructional and multitask scenarios.
+It leverages two datasets: [agentlans/crash-course](https://huggingface.co/datasets/agentlans/crash-course) and
+[vicgalle/configurable-system-prompt-multitask](https://huggingface.co/datasets/vicgalle/configurable-system-prompt-multitask)
+to improve its capabilities in handling diverse tasks and responding to various instruction formats.
+> [!NOTE]
+> **Update:** Despite the poor benchmark, the model seems OK at slightly complex prompts. There's more finetuning potential here.
 ## Intended Use
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/agentlans__Qwen2.5-0.5B-Instruct-CrashCourse-dropout-details)!
 Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=agentlans%2FQwen2.5-0.5B-Instruct-CrashCourse-dropout&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
+|      Metric       | Qwen2.5-0.5B-Instruct-CrashCourse-dropout | Qwen2.5-0.5B-Instruct |
+|-------------------|-----------------------------------------:|----------------------:|
+| **Average**       |                               7.74 %     |                8.38 %  |
+| IFEval (0-Shot)   |                              29.49 %     |               31.53 %  |
+| BBH (3-Shot)      |                               7.23 %     |                8.17 %  |
+| MATH Lvl 5 (4-Shot)|                               0.08 %     |                0.00 %  |
+| GPQA (0-shot)     |                               1.79 %     |                1.23 %  |
+| MuSR (0-shot)     |                               1.11 %     |                1.37 %  |
+| MMLU-PRO (5-shot) |                               6.76 %     |                8.00 %  |
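
The comparison table added in this commit can be sanity-checked with a few lines of Python. This is a minimal sketch, not part of the commit: the scores are copied from the table above, and it assumes that the **Average** row is the unweighted mean of the six benchmark scores, which is consistent with the reported 7.74 % and 8.38 % but is not stated in the README. The names `scores` and `mean` are illustrative only.

```python
# Scores (in percent) copied from the comparison table in this diff.
# Each tuple is (fine-tuned model, base Qwen2.5-0.5B-Instruct).
scores = {
    "IFEval (0-Shot)": (29.49, 31.53),
    "BBH (3-Shot)": (7.23, 8.17),
    "MATH Lvl 5 (4-Shot)": (0.08, 0.00),
    "GPQA (0-shot)": (1.79, 1.23),
    "MuSR (0-shot)": (1.11, 1.37),
    "MMLU-PRO (5-shot)": (6.76, 8.00),
}


def mean(values):
    """Unweighted arithmetic mean (assumed to match the Average row)."""
    values = list(values)
    return sum(values) / len(values)


finetuned_avg = mean(ft for ft, _ in scores.values())
base_avg = mean(base for _, base in scores.values())

# Metrics where the fine-tuned model scores higher than the base model.
wins = [metric for metric, (ft, base) in scores.items() if ft > base]

print(f"fine-tuned average: {finetuned_avg:.2f} %")
print(f"base average:       {base_avg:.2f} %")
for metric, (ft, base) in scores.items():
    print(f"{metric:<20} {ft - base:+.2f}")
print("fine-tuned ahead on:", ", ".join(wins))
```

Under that assumption, the per-metric differences show the fine-tuned model ahead only on MATH Lvl 5 and GPQA, and behind the base model on the other four benchmarks.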