Safal Shrestha

safal312

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 2 days ago

Training Reasoning Models on Saturated Problems via Failure-Prefix Conditioning

Paper • 2601.20829 • Published 4 days ago • 5

updated a model 7 months ago

safal312/llama8b-kk

8B • Updated Jun 30, 2025

published a model 7 months ago

safal312/llama8b-kk

8B • Updated Jun 30, 2025

updated a model 7 months ago

safal312/olmo1b

1B • Updated Jun 28, 2025

published a model 7 months ago

safal312/olmo1b

1B • Updated Jun 28, 2025

updated a model 7 months ago

safal312/qwen-1000

3B • Updated Jun 28, 2025

published a model 7 months ago

safal312/qwen-1000

3B • Updated Jun 28, 2025

updated a model 7 months ago

safal312/llama-sftkk

3B • Updated Jun 27, 2025

published a model 7 months ago

safal312/llama-sftkk

3B • Updated Jun 27, 2025

updated a model 7 months ago

safal312/llama-numina-sftkk

3B • Updated Jun 27, 2025

published a model 7 months ago

safal312/llama-numina-sftkk

3B • Updated Jun 27, 2025

updated a model 7 months ago

safal312/olmo-sft-7b

7B • Updated Jun 27, 2025 • 1

published a model 7 months ago

safal312/olmo-sft-7b

7B • Updated Jun 27, 2025 • 1

updated a model 7 months ago

safal312/olmo-sft

1B • Updated Jun 27, 2025

published a model 7 months ago

safal312/olmo-sft

1B • Updated Jun 27, 2025

New activity in safal312/knights_and_knaves_reasoning 8 months ago

Update task category, add Github link and project page

#2 opened 8 months ago by nielsr

updated a dataset 8 months ago

safal312/knights_and_knaves_reasoning

Viewer • Updated Jun 8, 2025 • 4.47k • 37

authored a paper 9 months ago

Reinforcement Learning vs. Distillation: Understanding Accuracy and Capability in LLM Reasoning

Paper • 2505.14216 • Published May 20, 2025 • 2

upvoted a paper 9 months ago

Reinforcement Learning vs. Distillation: Understanding Accuracy and Capability in LLM Reasoning

Paper • 2505.14216 • Published May 20, 2025 • 2

authored a paper 9 months ago

Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings

Paper • 2505.13718 • Published May 19, 2025 • 7
