giraffe-7b / README.md
---
datasets:
  - ashercn97/OpenOrcaPleaseWork
  - ashercn97/RenamedCodeEvol
language:
  - en
library_name: transformers
---

This model was fine-tuned on an Orca dataset and a multi-language code dataset. It is one of my first projects, but I am happy with how it works. Training took about 6 hours on two RTX 4090 GPUs. I used Axolotl to train it (HI IF YOU'RE SEEING THIS!!!).
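Since the metadata lists `transformers` as the library, loading the model might look like the sketch below. The repository id `ashercn97/giraffe-7b` is an assumption pieced together from the file path and the dataset namespace, not something this README states, and `generate` is a hypothetical helper name. The expected prompt format is not documented here, so the helper passes the prompt through as plain text.

```python
# Assumed repository id; this README does not state it explicitly.
MODEL_ID = "ashercn97/giraffe-7b"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and return its completion for `prompt` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require
    # transformers/torch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the raw prompt; no chat or instruction template is applied.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Write a Python function that reverses a string.")` would download the weights on first use. A 7B model in full precision needs a lot of memory, so passing a half-precision dtype to `from_pretrained` may be worth trying on a GPU.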

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 39.73 |
| ARC (25-shot)       | 47.18 |
| HellaSwag (10-shot) | 75.53 |
| MMLU (5-shot)       | 38.89 |
| TruthfulQA (0-shot) | 38.48 |
| Winogrande (5-shot) | 68.98 |
| GSM8K (5-shot)      | 2.65  |
| DROP (3-shot)       | 6.39  |
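The "Avg." row appears to be the unweighted mean of the seven benchmark scores listed above. A small sketch to check that arithmetic, where `leaderboard_average` is a hypothetical helper name and the scores are copied from this README:

```python
# Benchmark scores copied from the results above.
scores = {
    "ARC (25-shot)": 47.18,
    "HellaSwag (10-shot)": 75.53,
    "MMLU (5-shot)": 38.89,
    "TruthfulQA (0-shot)": 38.48,
    "Winogrande (5-shot)": 68.98,
    "GSM8K (5-shot)": 2.65,
    "DROP (3-shot)": 6.39,
}


def leaderboard_average(values):
    """Return the unweighted arithmetic mean, rounded to two decimals."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 39.73"
```

Because the mean is unweighted, the very low GSM8K and DROP scores pull the average well below the ARC, HellaSwag and Winogrande results.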