Datasets
cais/mmlu • Updated Mar 8, 2024 • 231k • 486k • 720
nvidia/OpenMathInstruct-1 • Updated Feb 16, 2024 • 6.08M • 2.09k • 252
microsoft/orca-math-word-problems-200k • Updated Mar 4, 2024 • 200k • 10.8k • 479
meta-math/MetaMathQA • Updated Dec 21, 2023 • 395k • 60.2k • 457
Papers
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection • Paper 2403.03507 • Published Mar 6, 2024 • 190
Math
microsoft/orca-math-word-problems-200k • Updated Mar 4, 2024 • 200k • 10.8k • 479
cais/mmlu • Updated Mar 8, 2024 • 231k • 486k • 720
nvidia/OpenMathInstruct-1 • Updated Feb 16, 2024 • 6.08M • 2.09k • 252
meta-math/MetaMathQA • Updated Dec 21, 2023 • 395k • 60.2k • 457