Commit 96a238f · committed by amoeba04 · verified · 1 Parent(s): e4824bb

Update README.md

  1. README.md +8 -28
README.md CHANGED
@@ -179,6 +179,7 @@ Register in `vlmeval/config.py`:
  from functools import partial
  from vlmeval.vlm import InternVLChat

+ # Add to ungrouped dict
  "KVL-DPO": partial(InternVLChat, model_path="amoeba04/KVL-DPO", max_new_tokens=16384, version="V2.0"),
  ```

@@ -187,37 +188,12 @@ Run evaluation:
  python run.py --data MMBench_DEV_EN --model KVL-DPO --verbose
  ```

- ## Intended Use Cases
+ ## Intended Use

- - **Scientific Document Understanding**: Analysis of figures, tables, and diagrams in scientific papers
- - **Medical Image Analysis**: Interpretation of radiology, pathology, and endoscopy images
+ - **Scientific Document Understanding**: Analyzing figures, tables, and diagrams from scientific papers
+ - **Medical Image Analysis**: Radiology, pathology, and endoscopy image interpretation
  - **Visual Question Answering**: General and domain-specific VQA tasks
  - **Chain-of-Thought Reasoning**: Complex visual reasoning with step-by-step explanations
- - **Human-Aligned Responses**: Improved response quality through preference optimization
-
- ## Model Comparison
-
- | Model | Training Method | Key Advantage |
- |-------|----------------|---------------|
- | KVL | SFT (4M samples) | Strong domain knowledge |
- | KVL-DPO | SFT + DPO | Better aligned with human preferences |
-
- ## License
-
- This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
-
- ## Citation
-
- If you use this model, please cite:
- ```bibtex
- @misc{kvl-dpo,
- title={KVL-DPO: Vision-Language Model with Direct Preference Optimization},
- author={amoeba04},
- year={2025},
- publisher={Hugging Face},
- url={https://huggingface.co/amoeba04/KVL-DPO}
- }
- ```

  ## Acknowledgments

@@ -225,3 +201,7 @@ If you use this model, please cite:
  - [ms-swift](https://github.com/modelscope/ms-swift) - Training framework
  - [MMInstruction](https://huggingface.co/MMInstruction) - VLFeedback dataset
  - All dataset creators for their valuable contributions
+
+ ## License
+
+ This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
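
Reviewer note on the first hunk: the entry registered in `vlmeval/config.py` stores a `functools.partial`, so the model is not constructed until the stored callable is invoked. The sketch below only illustrates that pattern and is not VLMEvalKit code. `FakeInternVLChat` is a hypothetical stand-in for `InternVLChat`, and the dict name `ungrouped` is taken from the added comment; how the real config consumes this dict is assumed, not confirmed by this diff.

```python
from functools import partial


class FakeInternVLChat:
    """Hypothetical stand-in for vlmeval.vlm.InternVLChat (not the real class)."""

    def __init__(self, model_path, max_new_tokens=1024, version="V1.0"):
        # Defaults here are illustrative only, not the library's defaults.
        self.model_path = model_path
        self.max_new_tokens = max_new_tokens
        self.version = version


# Assumed registry shape: model name -> zero-argument factory.
ungrouped = {
    "KVL-DPO": partial(
        FakeInternVLChat,
        model_path="amoeba04/KVL-DPO",
        max_new_tokens=16384,
        version="V2.0",
    ),
}

# Nothing is instantiated until the stored partial is called.
model = ungrouped["KVL-DPO"]()
print(model.model_path, model.max_new_tokens, model.version)
```

Storing a factory rather than an instance keeps the registry cheap to import: the configured arguments are bound up front, and construction happens only for the model that is actually selected.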