Kishan Panaganti

kishanpb

https://sites.google.com/a/tamu.edu/kpb/home

AI & ML interests

LLM Reasoning via RL and anything RL

Recent Activity

updated a dataset 2 months ago

kishanpb/halegannada-hosakannada

upvoted a paper 27 days ago

TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL

Paper • 2606.01599 • Published 29 days ago • 17

upvoted a paper about 2 months ago

Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis

Paper • 2605.14392 • Published May 14 • 9

upvoted a collection 4 months ago

Penguin-VL

Collection

7 items • Updated Apr 22 • 14

upvoted a paper 5 months ago

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Paper • 2601.19280 • Published Jan 27 • 9

upvoted 2 papers 8 months ago

TARDIS STRIDE: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy

Paper • 2506.11302 • Published Jun 12, 2025 • 3

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published Oct 23, 2025 • 20

upvoted 2 papers 9 months ago

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published Oct 2, 2025 • 28

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Paper • 2509.15194 • Published Sep 18, 2025 • 33

upvoted an article 12 months ago

Article

Bourbaki (7b): SOTA 7B Algorithms for Putnam Bench (Part I: Reasoning MDPs)

hba123 • Jul 13, 2025 • 11

upvoted a collection 12 months ago

Reward Models 06-2025