langfeng01/GiGPO-Qwen2.5-7B-Instruct-WebShop

langfeng01 committed on Sep 28, 2025

Commit f1cbb12 · verified · 1 Parent(s): b6b803b

Update README.md

Files changed (1)

README.md +34 -1

@@ -4,7 +4,40 @@ base_model:
 - Qwen/Qwen2.5-7B-Instruct
 ---
-To use this model, please refer to [verl-agent](https://github.com/langfengQ/verl-agent).
 `GiGPO-Qwen2.5-7B-Instruct-WebShop` is trained using [GiGPO](https://huggingface.co/papers/2505.10978) and the following prompt:

 - Qwen/Qwen2.5-7B-Instruct
 ---
+<p align="center">
+    <img src="./logo-verl-agent.png" alt="logo" width="55%">
+</p>
+<p align="center">
+  <a href="https://arxiv.org/abs/2505.10978">
+    <img src="https://img.shields.io/badge/arXiv-Paper-red?style=flat-square&logo=arxiv" alt="arXiv Paper"></a>
+  &nbsp;
+  <a href="https://github.com/langfengQ/verl-agent">
+    <img src="https://img.shields.io/badge/GitHub-Project-181717?style=flat-square&logo=github" alt="GitHub Project"></a>
+  &nbsp;
+  <a href="https://huggingface.co/collections/langfeng01/verl-agent-684970e8f51babe2a6d98554">
+    <img src="https://img.shields.io/badge/HuggingFace-Models-yellow?style=flat-square&logo=huggingface" alt="HuggingFace Models"></a>
+  &nbsp;
+  <a href="https://x.com/langfengq/status/1930848580505620677">
+    <img src="https://img.shields.io/badge/Twitter-Channel-000000?style=flat-square&logo=x" alt="X Channel"></a>
+</p>
+## Quick Start
+To use this model, follow these three steps:
+1. Clone [verl-agent](https://github.com/langfengQ/verl-agent).
+2. Set [`actor_rollout_ref.model.path`](https://github.com/langfengQ/verl-agent/blob/35b3da38293993f9bf4f7873dfb3262a361e956c/examples/gigpo_trainer/run_webshop.sh#L30) to the local path of this model, e.g. `your/own/path/GiGPO-Qwen2.5-7B-Instruct-WebShop`.
+3. Ensure [`trainer.val_before_train=True`](https://github.com/langfengQ/verl-agent/blob/35b3da38293993f9bf4f7873dfb3262a361e956c/examples/gigpo_trainer/run_webshop.sh#L72), so evaluation runs before training.
+For more details, refer to the [verl-agent](https://github.com/langfengQ/verl-agent) repository.
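The three Quick Start steps can be sketched as a small shell helper. This is an illustrative sketch, not part of the commit: the function names and `MODEL_DIR` are hypothetical, and fetching the checkpoint with `huggingface-cli` is an assumption (the README only says to point the setting at a local path). Nothing runs until a function is called.

```shell
# Hypothetical local directory for the checkpoint; replace with your own path.
MODEL_DIR="${MODEL_DIR:-your/own/path/GiGPO-Qwen2.5-7B-Instruct-WebShop}"

# Step 1: clone verl-agent (URL taken from the README).
clone_verl_agent() {
  git clone https://github.com/langfengQ/verl-agent.git
}

# Assumption: the checkpoint is downloaded with the Hugging Face CLI,
# which must be installed separately (pip install -U huggingface_hub).
download_model() {
  huggingface-cli download langfeng01/GiGPO-Qwen2.5-7B-Instruct-WebShop \
    --local-dir "$MODEL_DIR"
}

# Steps 2 and 3: print the two settings to apply in
# examples/gigpo_trainer/run_webshop.sh, one per line.
print_eval_settings() {
  printf '%s\n' \
    "actor_rollout_ref.model.path=$1" \
    "trainer.val_before_train=True"
}
```

A typical sequence would be `clone_verl_agent && download_model && print_eval_settings "$MODEL_DIR"`, then editing the two printed values into `run_webshop.sh` by hand before launching it.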
+---
+## Notes
 `GiGPO-Qwen2.5-7B-Instruct-WebShop` is trained using [GiGPO](https://huggingface.co/papers/2505.10978) and the following prompt: