Yu Xia

cheesewafer

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 13 hours ago

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Paper • 2602.08321 • Published 2 days ago • 35

upvoted a paper 2 days ago

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published 9 days ago • 32

updated a model 8 months ago

cheesewafer/web-alf-sci_15106_241696_prm_data_train-llama3-8b-sft_16_mse

Feature Extraction • 8B • Updated Jun 4, 2025

published a model 8 months ago

cheesewafer/web-alf-sci_15106_241696_prm_data_train-llama3-8b-sft_16_mse

Feature Extraction • 8B • Updated Jun 4, 2025

updated a model 8 months ago

cheesewafer/Llama3-8B-Instruct-sft-sciworld

Feature Extraction • 8B • Updated Jun 3, 2025

published a model 8 months ago

cheesewafer/Llama3-8B-Instruct-sft-sciworld

Feature Extraction • 8B • Updated Jun 3, 2025

updated a model 8 months ago

cheesewafer/Llama3-8B-Instruct-sft-alfworld

Feature Extraction • 8B • Updated Jun 3, 2025

published a model 8 months ago

cheesewafer/Llama3-8B-Instruct-sft-alfworld

Feature Extraction • 8B • Updated Jun 3, 2025

updated a model 8 months ago

cheesewafer/Llama3-8B-Instruct-sft-webshop

Feature Extraction • 8B • Updated Jun 3, 2025

published a model 8 months ago

cheesewafer/Llama3-8B-Instruct-sft-webshop

Feature Extraction • 8B • Updated Jun 3, 2025

updated a model 8 months ago

cheesewafer/Meta-Llama-3.2-3B-Instruct-regression-avg-all-v3-True-False

Feature Extraction • 3B • Updated Jun 3, 2025

published a model 8 months ago

cheesewafer/Meta-Llama-3.2-3B-Instruct-regression-avg-all-v3-True-False

Feature Extraction • 3B • Updated Jun 3, 2025

updated a model 8 months ago

cheesewafer/Meta-Llama-3.2-1B-Instruct-regression-avg-all-v3-True-False

Feature Extraction • 1B • Updated Jun 3, 2025

published a model 8 months ago

cheesewafer/Meta-Llama-3.2-1B-Instruct-regression-avg-all-v3-True-False

Feature Extraction • 1B • Updated Jun 3, 2025

updated a model 8 months ago

cheesewafer/Meta-Llama-3.1-8B-Instruct-regression-avg-all-v3-True-False

Feature Extraction • 8B • Updated Jun 3, 2025

published a model 9 months ago

cheesewafer/Meta-Llama-3.1-8B-Instruct-regression-avg-all-v3-True-False

Feature Extraction • 8B • Updated Jun 3, 2025

upvoted a paper 10 months ago

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31, 2025 • 303