Tianjin University

university

http://www.tju.edu.cn/english/

Recent Activity

lblaoke authored a paper 26 days ago

More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment

lblaoke authored a paper 26 days ago

DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning

lblaoke authored a paper 26 days ago

Learning Self-Correction in Vision-Language Models via Rollout Augmentation

Papers

KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance

WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation

authored 5 papers 26 days ago

More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment

Paper • 2504.02193 • Published Apr 3, 2025 • 1

DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning

Paper • 2510.02341 • Published Sep 27, 2025 • 4

Learning Self-Correction in Vision-Language Models via Rollout Augmentation

Paper • 2602.08503 • Published Feb 9 • 3

Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents

Paper • 2601.22311 • Published Jan 29

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control

Paper • 2604.26326 • Published 29 days ago • 14

submitted a paper to Daily Papers 26 days ago

Missing Old Logits in Asynchronous Agentic RL: Semantic Mismatch and Repair Methods for Off-Policy Correction

Paper • 2605.12070 • Published 27 days ago • 16

submitted a paper to Daily Papers 26 days ago

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control

Paper • 2604.26326 • Published 29 days ago • 14

submitted a paper to Daily Papers about 2 months ago

KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance

Paper • 2604.12627 • Published Apr 14 • 101

submitted a paper to Daily Papers about 2 months ago

SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting

Paper • 2604.10688 • Published Apr 12 • 26

submitted a paper to Daily Papers 3 months ago

WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation

Paper • 2603.15132 • Published Mar 16 • 35

authored a paper 4 months ago

ERNIE 5.0 Technical Report

Paper • 2602.04705 • Published Feb 4 • 269

authored a paper 8 months ago

Knowledge-Level Consistency Reinforcement Learning: Dual-Fact Alignment for Long-Form Factuality

Paper • 2509.23765 • Published Sep 28, 2025 • 3

authored 8 papers about 1 year ago

ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time

Paper • 2410.06625 • Published Oct 9, 2024 • 1

Bayesian Computation in Deep Learning

Paper • 2502.18300 • Published Feb 25, 2025

Cascade Reward Sampling for Efficient Decoding-Time Alignment

Paper • 2406.16306 • Published Jun 24, 2024 • 1

Entropy-MCMC: Sampling from Flat Basins with Ease

Paper • 2310.05401 • Published Oct 9, 2023

Long-tailed Classification from a Bayesian-decision-theory Perspective

Paper • 2303.06075 • Published Mar 10, 2023

Trustworthy Long-Tailed Classification

Paper • 2111.09030 • Published Nov 17, 2021

Graph Communal Contrastive Learning

Paper • 2110.14863 • Published Oct 28, 2021

Identifying Incorrect Classifications with Balanced Uncertainty

Paper • 2110.08030 • Published Oct 15, 2021