---
datasets:
  - ashercn97/OpenOrcaPleaseWork
  - ashercn97/RenamedCodeEvol
language:
  - en
library_name: transformers
---

This model is fine-tuned on an Orca dataset and a multi-language code dataset. It is one of my first projects, but I am happy with how it works. Training took about 6 hours on two RTX 4090 GPUs. I used Axolotl to train it (HI IF YOU'RE SEEING THIS!!!).

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	39.73
ARC (25-shot)	47.18
HellaSwag (10-shot)	75.53
MMLU (5-shot)	38.89
TruthfulQA (0-shot)	38.48
Winogrande (5-shot)	68.98
GSM8K (5-shot)	2.65
DROP (3-shot)	6.39
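
As a quick sanity check, the Avg. row matches the unweighted mean of the seven benchmark scores. Here is a minimal Python sketch, with the scores copied from the table; the variable names are only illustrative and not part of any leaderboard tooling:

```python
# Scores copied from the results table (Open LLM Leaderboard, in percent).
scores = {
    "ARC (25-shot)": 47.18,
    "HellaSwag (10-shot)": 75.53,
    "MMLU (5-shot)": 38.89,
    "TruthfulQA (0-shot)": 38.48,
    "Winogrande (5-shot)": 68.98,
    "GSM8K (5-shot)": 2.65,
    "DROP (3-shot)": 6.39,
}

# Unweighted arithmetic mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)

print(f"Avg. {average:.2f}")  # prints "Avg. 39.73"
```

Because the mean is unweighted, the low GSM8K and DROP scores pull the average well below the HellaSwag and Winogrande results.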