Kishan Panaganti

kishanpb


https://sites.google.com/a/tamu.edu/kpb/home

AI & ML interests

LLM Reasoning via RL and anything RL

Organizations

None yet

Recent Activity

upvoted a paper 27 days ago

TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL

Paper • 2606.01599 • Published 30 days ago • 17

upvoted a paper about 2 months ago

Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis

Paper • 2605.14392 • Published May 14 • 9

updated a dataset 2 months ago

kishanpb/halegannada-hosakannada

Viewer • Updated May 1 • 97.6k • 584

published a dataset 2 months ago

kishanpb/halegannada-hosakannada

Viewer • Updated May 1 • 97.6k • 584

liked a Space 2 months ago

Hy3-preview

⚡

Hy3-preview multi-turn streaming chat with function calling

upvoted a collection 4 months ago

Penguin-VL

Collection

7 items • Updated Apr 22 • 14

upvoted a paper 5 months ago

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Paper • 2601.19280 • Published Jan 27 • 9

authored a paper 5 months ago

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Paper • 2601.19280 • Published Jan 27 • 9

submitted a paper to Daily Papers 5 months ago

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Paper • 2601.19280 • Published Jan 27 • 9

authored 2 papers 6 months ago

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 55

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

Paper • 2512.15687 • Published Dec 17, 2025 • 22

authored a paper 8 months ago

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published Oct 23, 2025 • 20

upvoted 2 papers 8 months ago

TARDIS STRIDE: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy

Paper • 2506.11302 • Published Jun 12, 2025 • 3

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published Oct 23, 2025 • 20

upvoted a paper 9 months ago

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published Oct 2, 2025 • 28

authored 2 papers 9 months ago

TARDIS STRIDE: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy

Paper • 2506.11302 • Published Jun 12, 2025 • 3

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Paper • 2509.15194 • Published Sep 18, 2025 • 33

upvoted a paper 9 months ago

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Paper • 2509.15194 • Published Sep 18, 2025 • 33

upvoted an article 12 months ago

Bourbaki (7b): SOTA 7B Algorithms for Putnam Bench (Part I: Reasoning MDPs)

Article • hba123 • Jul 13, 2025 • 11

upvoted a collection 12 months ago

Reward Models 06-2025

Collection

Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 19 days ago • 24
