Commit e0c9db3 (parent: 9711067): Update README.md
README.md (CHANGED)
@@ -41,6 +41,7 @@ According to the leaderboard description, here are the benchmarks used for the e
 *Based on a [leaderboard clone](https://huggingface.co/spaces/gsaivinay/open_llm_leaderboard) with GPT-3.5 and GPT-4 included.
 
 ### Reproducing Evaluation Results
+*Instruction template taken from [Platypus 2 70B instruct](https://huggingface.co/garage-bAInd/Platypus2-70B-instruct).
 
 Install LM Evaluation Harness:
 ```
@@ -53,26 +54,25 @@ git checkout b281b0921b636bc36ad05c0b0b0763bd6dd43463
 # install
 pip install -e .
 ```
-Each task was evaluated on a single A100 80GB GPU.
 
 ARC:
 ```
-python main.py --model hf-causal-experimental --model_args pretrained=
+python main.py --model hf-causal-experimental --model_args pretrained=MayaPH/GodziLLa2-70B --tasks arc_challenge --batch_size 1 --no_cache --write_out --output_path results/G270B/arc_challenge_25shot.json --device cuda --num_fewshot 25
 ```
 
 HellaSwag:
 ```
-python main.py --model hf-causal-experimental --model_args pretrained=
+python main.py --model hf-causal-experimental --model_args pretrained=MayaPH/GodziLLa2-70B --tasks hellaswag --batch_size 1 --no_cache --write_out --output_path results/G270B/hellaswag_10shot.json --device cuda --num_fewshot 10
 ```
 
 MMLU:
 ```
-python main.py --model hf-causal-experimental --model_args pretrained=
+python main.py --model hf-causal-experimental --model_args pretrained=MayaPH/GodziLLa2-70B --tasks hendrycksTest-* --batch_size 1 --no_cache --write_out --output_path results/G270B/mmlu_5shot.json --device cuda --num_fewshot 5
 ```
 
 TruthfulQA:
 ```
-python main.py --model hf-causal-experimental --model_args pretrained=
+python main.py --model hf-causal-experimental --model_args pretrained=MayaPH/GodziLLa2-70B --tasks truthfulqa_mc --batch_size 1 --no_cache --write_out --output_path results/G270B/truthfulqa_0shot.json --device cuda
 ```
 
 ### Prompt Template
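The four evaluation commands added in this commit differ only in the harness task name, the few-shot count, and the output filename; the model, batch size, and other flags are identical. A minimal sketch that assembles them (the task names, shot counts, and `results/G270B/` paths are taken from the commands above; the helper function itself is illustrative and not part of the repository):

```python
# Sketch: rebuild the four LM Evaluation Harness invocations shown above.
# Only the benchmark-specific parts vary; everything else is shared.

MODEL = "MayaPH/GodziLLa2-70B"

# (harness task name, num_fewshot, output file) per benchmark, as in the README
BENCHMARKS = [
    ("arc_challenge", 25, "results/G270B/arc_challenge_25shot.json"),
    ("hellaswag", 10, "results/G270B/hellaswag_10shot.json"),
    ("hendrycksTest-*", 5, "results/G270B/mmlu_5shot.json"),
    ("truthfulqa_mc", 0, "results/G270B/truthfulqa_0shot.json"),
]

def build_command(task: str, fewshot: int, out_path: str, model: str = MODEL) -> str:
    """Assemble one evaluation invocation as a shell command string."""
    cmd = (
        f"python main.py --model hf-causal-experimental "
        f"--model_args pretrained={model} --tasks {task} "
        f"--batch_size 1 --no_cache --write_out "
        f"--output_path {out_path} --device cuda"
    )
    # TruthfulQA is run zero-shot above, with no --num_fewshot flag at all.
    if fewshot > 0:
        cmd += f" --num_fewshot {fewshot}"
    return cmd

if __name__ == "__main__":
    for task, shots, path in BENCHMARKS:
        print(build_command(task, shots, path))
```

Printing the generated strings and diffing them against the commands in the README is a quick way to check that a re-run uses exactly the leaderboard settings (25-shot ARC, 10-shot HellaSwag, 5-shot MMLU, 0-shot TruthfulQA).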