yilong xu

sapphirex

AI & ML interests

None yet

Recent Activity

upvoted a paper 7 days ago

From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning

Paper • 2606.17682 • Published 9 days ago • 26

upvoted a paper 13 days ago

Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning

Paper • 2606.13106 • Published 14 days ago • 21

upvoted a paper 15 days ago

Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

Paper • 2606.11052 • Published 16 days ago • 16

upvoted a paper 21 days ago

MemTrain: Self-Supervised Context Memory Training

Paper • 2606.03197 • Published 23 days ago • 17

upvoted a paper 22 days ago

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published 25 days ago • 44

upvoted a paper about 1 month ago

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

Paper • 2605.18703 • Published May 18 • 50

liked a dataset 3 months ago

Mosi-AI/LiveClawbench-trajectories

Updated 1 minute ago • 158 • 3

upvoted a paper 4 months ago

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

Paper • 2602.17684 • Published Feb 4 • 22

updated a dataset 5 months ago

sapphirex/lucene-msmarcov2.1

Updated Feb 8 • 3

published a dataset 5 months ago

sapphirex/lucene-msmarcov2.1

Updated Feb 8 • 3

upvoted a paper 5 months ago

MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling

Paper • 2602.03359 • Published Feb 3 • 10

upvoted a collection 8 months ago

Annotation-Efficient Universal Honesty Alignment

Collection

Official Collections of paper "Annotation-Efficient Universal Honesty Alignment". • 5 items • Updated Oct 21, 2025 • 3

upvoted a paper 8 months ago

Annotation-Efficient Universal Honesty Alignment

Paper • 2510.17509 • Published Oct 20, 2025 • 22

liked a model 8 months ago

lnm1p/search-gen-v-4b

Updated Oct 24, 2025 • 2

liked a dataset 8 months ago

lnm1p/Search-Gen-V

Viewer • Updated Oct 24, 2025 • 46.7k • 16 • 1

liked a model 9 months ago

openbmb/VoxCPM-0.5B

Text-to-Speech • Updated Sep 19, 2025 • 7.74k • 806

liked 3 models 10 months ago

google/embeddinggemma-300m

openbmb/MiniCPM4.1-8B

Text Generation • 8B • Updated Oct 24, 2025 • 53.2k • 391

openbmb/MiniCPM-V-4_5

Image-Text-to-Text • 9B • Updated Mar 10 • 85.9k • 1.09k

updated a dataset 10 months ago

sapphirex/RAVine-logs

Updated Aug 25, 2025 • 3.79k • 1
