jp1924

AI & ML interests

Audio, Image, Text

Recent Activity

upvoted a paper 23 days ago

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

Paper • 2509.02522 • Published Sep 2, 2025 • 26

upvoted a paper 26 days ago

Self-Improving Language Models with Bidirectional Evolutionary Search

Paper • 2605.28814 • Published 29 days ago • 60

New activity in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B 27 days ago

Update chat_template.jinja

#12 opened 3 months ago by jp1924

upvoted a paper 28 days ago

DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning

Paper • 2605.25604 • Published about 1 month ago • 138

upvoted a paper 30 days ago

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published May 22 • 246

upvoted 2 papers about 1 month ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published May 12 • 196

Process Rewards with Learned Reliability

Paper • 2605.15529 • Published May 15 • 53

liked a dataset about 1 month ago

TeichAI/DeepSeek-v4-Pro-Agent

Traces • Updated May 22 • 4.01k • 7.69k • 83

upvoted a paper about 2 months ago

RLPR: Extrapolating RLVR to General Domains without Verifiers

Paper • 2506.18254 • Published Jun 23, 2025 • 35

liked 2 datasets 2 months ago

nvidia/Nemotron-Personas-Korea

Viewer • Updated 2 days ago • 1M • 12.6k • 505

allenai/RLVR-IFeval

Viewer • Updated Nov 21, 2024 • 15k • 907 • 32

upvoted 2 papers 2 months ago

Reinforcement-aware Knowledge Distillation for LLM Reasoning

Paper • 2602.22495 • Published Feb 26 • 6

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 45

liked a Space 2 months ago

LLM Embeddings Explained: A Visual and Intuitive Guide

🚀

353

How Language Models Turn Text into Meaning, From Traditional

liked a dataset 2 months ago

llamaindex/ParseBench

Benchmark • Updated Apr 19 • 169k • 21.2k • 95

New activity in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B 3 months ago

Request to change the `model_type` of the HyperCLOVAX-SEED-32B model to `hyperclovax_vision_v2` (linked to a transformers PR)

#11 opened 3 months ago by jp1924

test

#10 opened 3 months ago by jp1924

liked a dataset 4 months ago

aiqwe/FinShibainu

Viewer • Updated Dec 18, 2024 • 87.4k • 74 • 7

liked a Space 4 months ago

CircleCI Test Collection Helper Space

📊

Query test results for a PR

updated a dataset 4 months ago

jp1924/PatternedUtteranceWithNumber

Preview • Updated Feb 25 • 1.24k
