Improving Data and Reward Design for Scientific Reasoning in Large Language Models Paper • 2602.08321 • Published 2 days ago • 35
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration Paper • 2602.01734 • Published 9 days ago • 32
cheesewafer/web-alf-sci_15106_241696_prm_data_train-llama3-8b-sft_16_mse Feature Extraction • 8B • Updated Jun 4, 2025
cheesewafer/Meta-Llama-3.2-3B-Instruct-regression-avg-all-v3-True-False Feature Extraction • 3B • Updated Jun 3, 2025
cheesewafer/Meta-Llama-3.2-1B-Instruct-regression-avg-all-v3-True-False Feature Extraction • 1B • Updated Jun 3, 2025
cheesewafer/Meta-Llama-3.1-8B-Instruct-regression-avg-all-v3-True-False Feature Extraction • 8B • Updated Jun 3, 2025
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31, 2025 • 303