ScaleBio Baseline

university

https://huggingface.co/RolandMinrui

Recent Activity

shizhuo2 submitted a paper 3 days ago

Useful Memories Become Faulty When Continuously Updated by LLMs

research4pan authored a paper 25 days ago

AgentSPEX: An Agent SPecification and EXecution Language

research4pan submitted a paper 25 days ago

AgentSPEX: An Agent SPecification and EXecution Language

submitted a paper to Daily Papers 3 days ago

Useful Memories Become Faulty When Continuously Updated by LLMs

Paper • 2605.12978 • Published 4 days ago • 18

authored a paper 25 days ago

AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published Apr 14 • 164

submitted a paper to Daily Papers 25 days ago

AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published Apr 14 • 164

submitted a paper to Daily Papers 3 months ago

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 44

authored a paper 7 months ago

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Paper • 2510.11769 • Published Oct 13, 2025 • 26

authored 2 papers about 1 year ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5, 2025 • 25

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26, 2025 • 82

authored a paper over 1 year ago

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 70

authored a paper over 1 year ago

$\textbf{Only-IF}$: Revealing the Decisive Effect of Instruction Diversity on Generalization

Paper • 2410.04717 • Published Oct 7, 2024 • 18

authored 3 papers almost 2 years ago

Instruction Diversity Drives Generalization To Unseen Tasks

Paper • 2402.10891 • Published Feb 16, 2024

PACE-LM: Prompting and Augmentation for Calibrated Confidence Estimation with GPT-4 in Cloud Incident Root Cause Analysis

Paper • 2309.05833 • Published Sep 11, 2023

PLUM: Preference Learning Plus Test Cases Yields Better Code Language Models

Paper • 2406.06887 • Published Jun 11, 2024 • 2

authored 7 papers about 2 years ago

Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise

Paper • 2312.14567 • Published Dec 22, 2023 • 1

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2

DetGPT: Detect What You Need via Reasoning

Paper • 2305.14167 • Published May 23, 2023

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

Paper • 2401.01916 • Published Jan 3, 2024 • 1

Plum: Prompt Learning using Metaheuristic

Paper • 2311.08364 • Published Nov 14, 2023

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

Paper • 2403.17919 • Published Mar 26, 2024 • 16

authored a paper over 2 years ago

Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion

Paper • 2401.12947 • Published Jan 23, 2024 • 4