huanghaizhi

Hale-Sage

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers about 1 month ago

RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

Paper • 2510.06710 • Published Oct 8, 2025 • 42

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

Paper • 2509.09372 • Published Sep 11, 2025 • 246

liked a model 3 months ago

Qwen/Qwen3-VL-2B-Thinking-FP8

Image-Text-to-Text • 2B • Updated Nov 26, 2025 • 1.2k • 22

liked a model 5 months ago

meta-llama/Llama-2-7b-hf

Text Generation • 7B • Updated Apr 17, 2024 • 435k • 2.26k

updated a dataset 6 months ago

Hale-Sage/unix

Viewer • Updated Aug 6, 2025 • 101k • 5

updated a model 6 months ago

Hale-Sage/ppo-CartPole-v1

Reinforcement Learning • Updated Aug 5, 2025

published a model 6 months ago

Hale-Sage/ppo-CartPole-v1

Reinforcement Learning • Updated Aug 5, 2025

reacted to sergiopaniego's post with 👍 6 months ago

Post

Just included example scripts for aligning models using GSPO (including VLM example) 🙆‍♂️🙆‍♂️

GSPO is the latest RL alignment algo by @Alibaba_Qwen and it's already supported in the latest TRL v0.20 release.

Super-easy-to-get-started example scripts below, GO run them!👩‍💻👩‍💻

🧑‍🎨 Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
🦄 VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
🧩 More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
🧙‍♂️ GSPO paper: Group Sequence Policy Optimization (2507.18071)

updated a dataset 6 months ago

Hale-Sage/eval_koch

Viewer • Updated Jul 28, 2025 • 558 • 1

published 2 datasets 6 months ago

Hale-Sage/eval_koch

Viewer • Updated Jul 28, 2025 • 558 • 1

Hale-Sage/unix

Viewer • Updated Aug 6, 2025 • 101k • 5

liked a model 7 months ago

Hale-Sage/hhz-sd-class-butterflies-32

Updated Jul 21, 2025 • 2 • 1

liked 2 datasets 7 months ago

Hale-Sage/Grab_the_cube_3cam

Viewer • Updated Jul 11, 2025 • 28.6k • 49 • 1

Hale-Sage/merge_data

Viewer • Updated Jul 8, 2025 • 326k • 95 • 1

updated a model 7 months ago

Hale-Sage/hhz-sd-class-butterflies-32

Updated Jul 21, 2025 • 2 • 1

published a model 7 months ago

Hale-Sage/hhz-sd-class-butterflies-32

Updated Jul 21, 2025 • 2 • 1

updated a dataset 7 months ago

Hale-Sage/Grab_the_cube_2cam

Viewer • Updated Jul 11, 2025 • 27.7k • 63

published a dataset 7 months ago

Hale-Sage/Grab_the_cube_2cam

Viewer • Updated Jul 11, 2025 • 27.7k • 63

updated a dataset 7 months ago

Hale-Sage/Grab_the_cube_3cam

Viewer • Updated Jul 11, 2025 • 28.6k • 49 • 1

published a dataset 7 months ago

Hale-Sage/Grab_the_cube_3cam

Viewer • Updated Jul 11, 2025 • 28.6k • 49 • 1