Anirudh Buvanesh

anirudhb11

AI & ML interests

None yet

Recent Activity

updated a dataset about 1 month ago

anirudhb11/rebase_vgs_gpt-oss-20b_rg_games_ns128_md4_bt0_1_seed42_rg_games_vgs_2gpu

updated a dataset about 1 month ago

anirudhb11/rebase_vgs_gpt-oss-20b_rg_games_ns32_md4_bt0_1_seed42_rg_games_vgs_2gpu

updated a dataset about 1 month ago

anirudhb11/rebase_vgs_gpt-oss-20b_rg_games_ns8_md4_bt0_1_seed42_rg_games_vgs_2gpu

Organizations

None yet

anirudhb11's models (396)

anirudhb11/r1d-1.5b_dapo_subset_2k_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_dapo_subset_5k_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_dapo_subset_5k_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_dapo_subset_10k_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_dapo_subset_10k_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_dapo_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_dapo_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_2k_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_2k_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_5k_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_5k_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_10k_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_10k_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_20k_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_subset_20k_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_repeated_fin_critic

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_repeated_fin_actor

Updated Oct 29, 2025

anirudhb11/r1d-1.5b_deepscaler_longcot_8k_ppo_dapo_DeepSeek-R1-Distill-Qwen-1.5B_critic

Updated Oct 26, 2025

anirudhb11/r1d-1.5b_deepscaler_longcot_8k_ppo_dapo_DeepSeek-R1-Distill-Qwen-1.5B_actor

Updated Oct 26, 2025

anirudhb11/r1d-1.5b_deepscaler_longcot_8k_ppo_dapo_DeepSeek-R1-Distill-Qwen-1.5B_subset_2000_r3_critic

Updated Oct 26, 2025

anirudhb11/r1d-1.5b_deepscaler_longcot_8k_ppo_dapo_DeepSeek-R1-Distill-Qwen-1.5B_subset_2000_r3_actor

Updated Oct 26, 2025

anirudhb11/r1d-1.5b_deepscaler_longcot_8k_ppo_hendrycks_math_DeepSeek-R1-Distill-Qwen-1.5B_critic

Updated Oct 26, 2025

anirudhb11/r1d-1.5b_deepscaler_longcot_8k_ppo_hendrycks_math_DeepSeek-R1-Distill-Qwen-1.5B_actor

Updated Oct 26, 2025

anirudhb11/critic_200_ppo-run-math-training-prompt-len-800-response-len-4096-r3-actor-low-lr-0-ae0cd033d2

Text Classification • 2B • Updated Oct 17, 2025 • 2

anirudhb11/actor_200_ppo-run-math-training-prompt-len-800-response-len-4096-r3-actor-low-lr-0-76f5638c9d

Text Generation • 2B • Updated Oct 17, 2025 • 1

anirudhb11/critic_400_ppo-run-math-training-prompt-len-800-response-len-4096-r3-actor-low-lr-0-4bac9e133e

Text Classification • 2B • Updated Oct 17, 2025 • 2

anirudhb11/actor_400_ppo-run-math-training-prompt-len-800-response-len-4096-r3-actor-low-lr-0-f767916602

Text Generation • 2B • Updated Oct 17, 2025 • 1

anirudhb11/critic_600_ppo-run-math-training-prompt-len-800-response-len-4096-r3-actor-low-lr-0-c0842d8e93

Text Classification • 2B • Updated Oct 17, 2025 • 2