Update README.md
README.md
CHANGED
@@ -17,7 +17,6 @@ This is the first study of applying Reinforcement Learning with Verifiable Rewar
 open-ended and subjective image captioning task. Unlike traditional Supervised Fine-Tuning, which
 can lead to models memorizing a limited set of annotated captions, our method allows the model to
 explore and generate a broader range of creative and general descriptions.
-
 CapRL is a new training paradigm featuring a decoupled two-stage pipeline. The initial
 stage uses LVLMs to generate rich and accurate captions. Subsequently, the second stage evaluates
 caption quality by using a vision-only LLM to perform the QA task. We also created a specific QA