Update README.md
README.md
CHANGED
@@ -32,7 +32,7 @@ license: apache-2.0
 **Archer2.0** marks a significant evolution from its predecessor through the introduction of **Asymmetric Importance Sampling Policy Optimization (ASPO)**, which is designed to overcome the fundamental limitations of **PPO-Clip**, effectively mitigating issues like **entropy collapse** and **repetitive outputs**, preventing **premature convergence**, and thereby enabling more advanced **reinforcement learning** capabilities.
 
 <div align="center">
-<img src="
+<img src="archer2.0.png" width="100%"/>
 </div>
 <br>
 