TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_r1-countdown_4arg-eval_0
• 1k • 7
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_basemodel-commonsenseQA__v1
• 1.23k • 10
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_basemodel-commonsenseQA-eval_0
• 1.22k • 6
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_basemodel-countdown_6arg__v1
• 1k • 7
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_basemodel-countdown_6arg-eval_0
• 1k • 5
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_ours_sft-countdown_3arg__v1
• 1k • 10
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_ours_sft-countdown_3arg-eval_sft
• 1k • 6
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_basemodel-countdown_5arg__v1
• 1k • 8
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_basemodel-countdown_5arg-eval_0
• 1k • 7
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_basemodel-countdown_4arg__v1
• 1k • 8
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_basemodel-countdown_4arg-eval_0
• 1k • 6
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_star-countdown_3arg__v1
• 1k • 9
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_star-countdown_3arg-eval_rl
• 1k • 7
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_basemodel-countdown_3arg__v1
• 1k • 9
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_basemodel-countdown_3arg-eval_0
• 1k • 7
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_r1-countdown_3arg__v1
• 1.01k • 8
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_r1-countdown_3arg-eval_0
• 1k • 5
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_bolt-countdown_3arg__v1
• 1k • 9
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_bolt-countdown_3arg-eval_rl
• 1k • 5
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval__3args_rlonly-eval_rl
• 12.5k • 10
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval__3args_ours-eval_rl
• 12.5k • 9
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_ours-letter_countdown_4o__v1
• 304 • 8
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_ours-letter_countdown_4o-eval_rl
• 300 • 7
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_ours-letter_countdown_5o__v1
• 304 • 8
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_ours-letter_countdown_5o-eval_rl
• 300 • 6
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_ours-acronym_4o__v1
• 201 • 10
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_ours-acronym_4o-eval_rl
• 197 • 6
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_ours-acronym_5o__v1
• 148 • 7
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_ours-acronym_5o-eval_rl
• 144 • 6
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_ours-longmult_5dig__v1
• 1k • 7