In a Training Loop 🔄

Kanishkha Jaisankar

jkanishkha0305

https://jkanishkha.com

AI & ML interests

Recent Activity

upvoted an article about 16 hours ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

upvoted a paper 1 day ago

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

upvoted a paper about 1 month ago

Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG

Organizations

upvoted an article about 16 hours ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Article • NormalUhr • Feb 11, 2025 • 125

upvoted a paper 1 day ago

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published 9 days ago • 204

upvoted a paper about 1 month ago

Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG

Paper • 2604.14572 • Published Apr 16 • 7

updated a model 2 months ago

jkanishkha0305/checkpoints

Updated Mar 26

published a model 2 months ago

jkanishkha0305/checkpoints

Updated Mar 26

liked a model 6 months ago

JeethuSri/archer-exports

Updated Nov 13, 2025 • 1

liked a model 8 months ago

openai-community/gpt2

Text Generation • 0.1B • Updated Feb 19, 2024 • 16.3M • 3.27k

updated 2 models 8 months ago

jkanishkha0305/gemma3_270m_orpo

0.3B • Updated Sep 25, 2025 • 2

jkanishkha0305/gemma3_270m_dpo

0.3B • Updated Sep 25, 2025 • 2

published 2 models 8 months ago

jkanishkha0305/gemma3_270m_orpo

0.3B • Updated Sep 25, 2025 • 2

jkanishkha0305/gemma3_270m_dpo

0.3B • Updated Sep 25, 2025 • 2

published 2 models 9 months ago

jkanishkha0305/gemma3-2b-mental-mix-sft

Updated Sep 16, 2025

jkanishkha0305/gemma-270m-fullfinetuned

Updated Sep 1, 2025

updated a model 9 months ago

jkanishkha0305/gemma3_270m_sft_qlora

Updated Aug 22, 2025

published a model 9 months ago

jkanishkha0305/gemma3_270m_sft_qlora

Updated Aug 22, 2025

liked 4 models 9 months ago

liked a dataset 9 months ago

ShenLab/MentalChat16K

Viewer • Updated Jul 14, 2025 • 16.1k • 1.04k • 56
