Commit History
mistral 10-shot 33cd694
rtx4090 0-shot d028752
ready for few shots eval cf912f1
claude 0-shot 397a2fa
added original data from MGTV challenge 5f9686b
https://github.com/mazzzystar/TurtleBenchmark 444a581
compare o1 vs gpt-4o 4cd13da
o1-mini analyzed f1b0a53
o1-mini results fd14581
LogiQA2.0 dataset bf13772
openai batch 921fa92
Create 04e_OpenAI_comparison.ipynb 2bb5512
internlm_v2 results 83818dc
internlm2_5-7b-chat fine-tune results e4bce5e
added scripts/eval-mgtv-internlm_v2.sh 71dcee7
Update 04_Few-shot_Prompting_OpenAI.ipynb 8e678e8
ready for fine-tuning internlm2_5-20b-chat 62c2b84
saved best results/metrics 573f5d1
completed eval/analysis 468b88d
qwen2-72b full results 6e932d8
openai zero-shot results 8b9bb19
Update eval_logical_reasoning_all_epochs.py 090acf8
change BATCH_SIZE to 1 for qwen2-72b eval 4c31851
open source LLM results almost done 5a8f8d2
llama3.1-70b done 5dc41da
mistral updated a9f4f1f
llama-3.1-70b wip 60dc2c4
llama-3.1-70b wip 717ab95
mistral wip 9129c41
llama3.1-70b wip e5b5f58
Update llm_utils.py 71af822
mistral complete 1e26971
llama3.1-70b wip ff0dc02
mistral wip 2f6ccd3
llama-3.1-8b results fa0492a
Update eval-mgtv-qwen2_72b.sh 0b58370
clean up 629867b
done fine-tuning 72043da
ready for eval 473e849
qwen2 72b 80% results 62df289
tuning mistral cn bbea107
ready for bf16 tuning e656f92
qwen2 72b checkpoints 9dc50d8
qwen2 72b eval results 7d7eda5
Create eval-mgtv-qwen2_70b.sh 385e4f4
fix training configs 64655af
ready for QLoRA b87bc1a
Update 16_Submissions.ipynb 5066565
Hu Haotian committed on