Bolian Li

lblaoke

https://lblaoke.github.io/

AI & ML interests

None yet

Recent Activity

authored a paper about 2 months ago

More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment

authored a paper about 2 months ago

DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning

authored a paper about 2 months ago

Learning Self-Correction in Vision-Language Models via Rollout Augmentation

lblaoke's models (44)

lblaoke/mistral-v0.1-7b-ppo-self

7B • Updated Feb 4, 2025 • 1

lblaoke/mistral-v0.1-7b-ppo-human

7B • Updated Feb 4, 2025 • 1

lblaoke/llama2-7b-ppo-self-human

7B • Updated Feb 3, 2025 • 2

lblaoke/llama2-7b-ppo-self

7B • Updated Feb 3, 2025 • 3

lblaoke/llama2-7b-ppo-human

7B • Updated Feb 3, 2025 • 1

lblaoke/mistral-v0.3-7b-rm-human

Text Classification • 7B • Updated Jan 14, 2025 • 3

lblaoke/mistral-v0.3-7b-rm-self-human

Text Classification • 7B • Updated Jan 14, 2025 • 2

lblaoke/mistral-v0.3-7b-rm-self

Text Classification • 7B • Updated Jan 14, 2025 • 2

lblaoke/mistral-v0.1-7b-rm-self-human

Text Classification • 7B • Updated Jan 14, 2025 • 1

lblaoke/mistral-v0.1-7b-rm-self

Text Classification • 7B • Updated Jan 14, 2025 • 5

lblaoke/llama2-7b-rm-self

Text Classification • 7B • Updated Jan 14, 2025 • 2

lblaoke/mistral-v0.1-7b-rm-human

Text Classification • 7B • Updated Jan 14, 2025 • 3

lblaoke/llama2-7b-rm-human

Text Classification • 7B • Updated Jan 14, 2025 • 4

lblaoke/llama2-7b-rm-self-human

Text Classification • 7B • Updated Jan 13, 2025 • 1